(54) SOUND SOURCE VECTOR GENERATOR, VOICE ENCODER, AND VOICE DECODER



(57) A random code vector reading section and a random codebook of a conventional CELP type speech
coder/decoder are respectively replaced with an oscillator for outputting different vector streams in
accordance with values of input seeds, and a seed storage section for storing a plurality of seeds.
This makes it unnecessary to store fixed vectors as they are in a fixed codebook (ROM), thereby
considerably reducing the memory capacity.
Description
Technical Field 

The present invention relates to an excitation vector generator capable of obtaining a high-quality synthesized
speech, and a speech coder and a speech decoder which can code and decode a high-quality speech signal at a low
bit rate.

Background Art 


A CELP (Code Excited Linear Prediction) type speech coder executes linear prediction for each of the frames obtained
by segmenting a speech at a given time, and codes predictive residuals (excitation signals) resulting from the
frame-by-frame linear prediction, using an adaptive codebook having old excitation vectors stored therein and a
random codebook which has a plurality of random code vectors stored therein. For instance, "Code-Excited Linear
Prediction (CELP): High-Quality Speech at Very Low Bit Rate," M. R. Schroeder, Proc. ICASSP '85, pp. 937-940
discloses a CELP type speech coder.

FIG. 1 illustrates the schematic structure of a CELP type speech coder. The CELP type speech coder separates
vocal information into excitation information and vocal tract information and codes them. With regard to the vocal
tract information, an input speech signal 10 is input to a filter coefficients analysis section 11 for linear
prediction, and linear predictive coefficients (LPCs) are coded by a filter coefficients quantization section 12.
Supplying the linear predictive coefficients to a synthesis filter 13 allows vocal tract information to be added to
excitation information in the synthesis filter 13. With regard to the excitation information, excitation vector
search in an adaptive codebook 14 and a random codebook 15 is carried out for each segment obtained by further
segmenting a frame (called a subframe). The search in the adaptive codebook 14 and the search in the random
codebook 15 are processes of determining the code number and gain (pitch gain) of an adaptive code vector, and the
code number and gain (random code gain) of a random code vector, which minimize the coding distortion in an
equation 1.



$$\| v - (g_a H p + g_c H c) \|^2 \qquad (1)$$

where v: speech signal (vector)
H: impulse response convolution matrix of the synthesis filter, formed from the impulse response h over the frame length L
h: impulse response (vector) of the synthesis filter
L: frame length
p: adaptive code vector
c: random code vector
ga: adaptive code gain (pitch gain)
gc: random code gain
Because a closed loop search of the code that minimizes the equation 1 involves a vast amount of computation,
however, an ordinary CELP type speech coder first performs adaptive codebook search to specify the code number of an
adaptive code vector, and then executes random codebook search based on the searching result to specify the code
number of a random code vector.

The random codebook search by the CELP type speech coder will now be explained with reference to FIGS. 2A through
2C. In the figures, a symbol x is a target vector for the random codebook search obtained by an equation 2. It is
assumed that the adaptive codebook search has already been accomplished.

$$x = v - g_a H p \qquad (2)$$

where x: target (vector) for the random codebook search
v: speech signal (vector)
H: impulse response convolution matrix of the synthesis filter
p: adaptive code vector
ga: adaptive code gain (pitch gain)
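
Equation 2 can be illustrated with a small numerical sketch. It assumes that H is the usual lower-triangular convolution matrix, so that multiplying by H is the same as convolving with h and truncating to the frame length. The function names `convolve_truncated` and `target_vector` and all sample values are illustrative assumptions, not part of the specification.

```python
def convolve_truncated(h, c):
    """Return Hc: the convolution of c with impulse response h, truncated to len(c)."""
    n = len(c)
    return [
        sum(h[k] * c[i - k] for k in range(min(i + 1, len(h))))
        for i in range(n)
    ]


def target_vector(v, h, p, ga):
    """Equation 2: x = v - ga * H p."""
    hp = convolve_truncated(h, p)
    return [vi - ga * hpi for vi, hpi in zip(v, hp)]


# Toy values, chosen only for illustration.
h = [1.0, 0.5, 0.25, 0.125]   # impulse response of the synthesis filter
p = [1.0, 0.0, 0.0, 0.0]      # adaptive code vector
v = [2.0, 1.0, 1.0, 0.0]      # speech signal
x = target_vector(v, h, p, ga=2.0)
```

With these toy values the adaptive contribution ga*Hp is removed from v, and the remainder x is what the random codebook search has to match.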

The random codebook search is a process of specifying a random code vector c which minimizes coding distortion 
that is defined by an equation 3 in a distortion calculator 16 as shown in FIG. 2A. 

$$\| x - g_c H c \|^2 \qquad (3)$$

where x: target (vector) for the random codebook search
H: impulse response convolution matrix of the synthesis filter
c: random code vector
gc: random code gain.

The distortion calculator 16 controls a control switch 21 to switch a random code vector to be read from the random
codebook 15 until the random code vector c is specified.

An actual CELP type speech coder has a structure in FIG. 2B to reduce the computational complexities, and a
distortion calculator 16' carries out a process of specifying a code number which maximizes a distortion measure in
an equation 4.

$$\frac{(x^t H c)^2}{\|Hc\|^2} = \frac{((x^t H) c)^2}{\|Hc\|^2} = \frac{({x'}^t c)^2}{\|Hc\|^2} = \frac{({x'}^t c)^2}{c^t H^t H c} \qquad (4)$$

where x: target (vector) for the random codebook search
H: impulse response convolution matrix of the synthesis filter
H^t: transposed matrix of H
x': time reverse synthesis of x using H (x'^t = x^t H)
c: random code vector.

Specifically, the random codebook control switch 21 is connected to one terminal of the random codebook 15 and
the random code vector c is read from an address corresponding to that terminal. The read random code vector c is
synthesized with vocal tract information by the synthesis filter 13, producing a synthesized vector Hc. Then, the
distortion calculator 16' computes a distortion measure in the equation 4 using a vector x' obtained by a time
reverse process of a target x, the vector Hc resulting from synthesis of the random code vector in the synthesis
filter, and the random code vector c. As the random codebook control switch 21 is switched, computation of the
distortion measure is performed for every random code vector in the random codebook.

Finally, the number of the random codebook control switch 21 that had been connected when the distortion measure
in the equation 4 became maximum is sent to a code output section 17 as the code number of the random code
vector.
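
The search just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the coder's actual implementation: H is taken to be the lower-triangular convolution matrix of h, and the names `convolve_truncated`, `backward_filter` and `search_codebook`, as well as the toy codebook, are hypothetical.

```python
def convolve_truncated(h, c):
    """Return Hc (synthesis of c), truncated to the length of c."""
    n = len(c)
    return [
        sum(h[k] * c[i - k] for k in range(min(i + 1, len(h))))
        for i in range(n)
    ]


def backward_filter(h, x):
    """Return x' = H^t x, the time reverse synthesis of the target x."""
    n = len(x)
    return [
        sum(h[k] * x[i + k] for k in range(min(n - i, len(h))))
        for i in range(n)
    ]


def search_codebook(codebook, h, x):
    """Return (code number, measure) maximizing (x'^t c)^2 / ||Hc||^2 (equation 4)."""
    x_rev = backward_filter(h, x)          # computed once per target
    best_index, best_measure = -1, -1.0
    for index, c in enumerate(codebook):   # plays the role of the control switch
        hc = convolve_truncated(h, c)
        energy = sum(s * s for s in hc)
        if energy == 0.0:
            continue                       # skip vectors with no synthesized energy
        corr = sum(a * b for a, b in zip(x_rev, c))
        measure = corr * corr / energy
        if measure > best_measure:
            best_index, best_measure = index, measure
    return best_index, best_measure


# Toy example: the target is twice the synthesis of the second code vector.
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
best, measure = search_codebook(codebook, h=[1.0, 0.5], x=[0.0, 2.0, 1.0])
```

Computing x' once outside the loop is what makes the form on the right of equation 4 cheaper than evaluating equation 3 for every candidate.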

FIG. 2C shows a partial structure of a speech decoder. The switching of the random codebook control switch 21 is
controlled in such a way as to read out the random code vector that has a transmitted code number. After a
transmitted random code gain gc and filter coefficient are set in an amplifier 23 and a synthesis filter 24, a
random code vector is read out to restore a synthesized speech.
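
As a rough sketch of this decoding path, the following assumes an all-pole synthesis filter of the form s(n) = e(n) + sum_k a_k s(n-k). The sign convention of the coefficients, the function names `synthesize` and `decode`, and the toy codebook are assumptions made for illustration only.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis: s(n) = e(n) + sum_k lpc[k] * s(n - 1 - k)."""
    out = []
    for n, e in enumerate(excitation):
        feedback = sum(
            a * out[n - 1 - k] for k, a in enumerate(lpc) if n - 1 - k >= 0
        )
        out.append(e + feedback)
    return out


def decode(codebook, code_number, gc, lpc):
    """Read the code vector with the transmitted number, scale it and synthesize."""
    c = codebook[code_number]              # control switch selects the vector
    scaled = [gc * s for s in c]           # amplifier applies the gain gc
    return synthesize(scaled, lpc)         # synthesis filter restores speech


codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
speech = decode(codebook, code_number=0, gc=2.0, lpc=[0.5])
```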

In the above-described speech coder/speech decoder, the greater the number of random code vectors stored as
excitation information in the random codebook 15 is, the more likely it is that a random code vector close to the
excitation vector of an actual speech can be found. As the capacity of the random codebook (ROM) is limited,
however, it is not possible to store countless random code vectors corresponding to all the excitation vectors in
the random codebook. This restricts improvement of the speech quality.

An algebraic excitation has also been proposed which can significantly reduce the computational complexities of
coding distortion in a distortion calculator and can eliminate a random codebook (ROM) (described in "8 KBIT/S ACELP
CODING OF SPEECH WITH 10 MS SPEECH-FRAME: A CANDIDATE FOR CCITT STANDARDIZATION": R. Salami, C. Laflamme,
J-P. Adoul, ICASSP '94, pp. II-97 to II-100, 1994).

The algebraic excitation considerably reduces the complexities of computation of coding distortion by previously
computing the results of convolution of the impulse response of a synthesis filter and a time-reversed target and the
autocorrelation of the synthesis filter and developing them in a memory. Further, a ROM in which random code vectors
have been stored is eliminated by algebraically generating random code vectors. A CS-ACELP and an ACELP which use
the algebraic excitation have been recommended respectively as G.729 and G.723.1 by the ITU-T.
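
The saving from this precomputation can be sketched as follows, assuming a code vector made of a few signed unit pulses. With d = H^t x and Phi = H^t H held in memory, the measure of equation 4 needs only table lookups per candidate. The names `convolution_matrix`, `precompute` and `pulse_measure` and the toy values are hypothetical, and the sketch does not reproduce the pulse layout of any particular standard.

```python
def convolution_matrix(h, n):
    """Lower-triangular n x n convolution matrix H built from h."""
    return [
        [h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n)]
        for i in range(n)
    ]


def precompute(h, x):
    """Return d = H^t x and Phi = H^t H, computed once per target."""
    n = len(x)
    H = convolution_matrix(h, n)
    d = [sum(H[i][j] * x[i] for i in range(n)) for j in range(n)]
    phi = [
        [sum(H[i][a] * H[i][b] for i in range(n)) for b in range(n)]
        for a in range(n)
    ]
    return d, phi


def pulse_measure(d, phi, pulses):
    """Equation 4 for a vector of signed unit pulses given as (position, sign)."""
    numerator = sum(sign * d[pos] for pos, sign in pulses) ** 2
    denominator = sum(
        sa * sb * phi[pa][pb] for pa, sa in pulses for pb, sb in pulses
    )
    return numerator / denominator


d, phi = precompute(h=[1.0, 0.5, 0.25], x=[1.0, 2.0, 3.0, 4.0])
measure = pulse_measure(d, phi, pulses=[(0, 1), (2, -1)])
```

No synthesis filtering is performed inside `pulse_measure`; each candidate costs a handful of additions over the stored tables.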

In the CELP type speech coder/speech decoder equipped with the above-described algebraic excitation in a ran- 
dom codebook section, however, a target for a random codebook search is always coded with a pulse sequence vector, 
which puts a limit to improvement on speech quality. 

Disclosure of Invention

It is therefore a primary object of the present invention to provide an excitation vector generator, a speech coder
and a speech decoder, which can significantly reduce the memory capacity as compared with a case where random
code vectors are stored directly in a random codebook, and can improve the speech quality.

It is a secondary object of this invention to provide an excitation vector generator, a speech coder and a speech
decoder, which can generate complicated random code vectors as compared with a case where an algebraic excitation
is provided in a random codebook section and a target for a random codebook search is coded with a pulse sequence
vector, and can improve the speech quality.

In this invention, the fixed code vector reading section and fixed codebook of a conventional CELP type speech
coder/decoder are respectively replaced with an oscillator, which outputs different vector sequences in accordance
with the values of input seeds, and a seed storage section which stores a plurality of seeds (seeds of the
oscillator). This eliminates the need for fixed code vectors to be stored directly in a fixed codebook (ROM) and can
thus reduce the memory capacity significantly.

Further, according to this invention, the random code vector reading section and random codebook of the
conventional CELP type speech coder/decoder are respectively replaced with an oscillator and a seed storage section.
This eliminates the need for random code vectors to be stored directly in a random codebook (ROM) and can thus reduce
the memory capacity significantly.

The invention is an excitation vector generator which is so designed as to store a plurality of fixed waveforms,
arrange the individual fixed waveforms at respective start positions based on start position candidate information
and add those fixed waveforms to generate an excitation vector. This permits an excitation vector close to an actual
speech to be generated.

Further, the invention is a CELP type speech coder/decoder constructed by using the above excitation vector
generator as a random codebook. A fixed waveform arranging section may algebraically generate start position
candidate information of fixed waveforms.

Furthermore, the invention is a CELP type speech coder/decoder, which stores a plurality of fixed waveforms,
generates an impulse with respect to start position candidate information of each fixed waveform, convolutes the
impulse response of a synthesis filter and each fixed waveform to generate an impulse response for each fixed
waveform, computes the autocorrelations and correlations of the impulse responses of the individual fixed waveforms
and develops them in a correlation matrix. This can provide a speech coder/decoder which improves the quality of a
synthesized speech at about the same computation cost as needed in a case of using an algebraic excitation as a
random codebook.

Moreover, this invention is a CELP type speech coder/decoder equipped with a plurality of random codebooks and
switch means for selecting one of the random codebooks. At least one random codebook may be the aforementioned
excitation vector generator, or at least one random codebook may be a vector storage section having a plurality of
random number sequences stored therein or a pulse sequences storage section having a plurality of random number
sequences stored therein, or at least two random codebooks each having the aforementioned excitation vector
generator may be provided with the number of fixed waveforms to be stored differing from one random codebook to
another, and the switch means selects one of the random codebooks so as to minimize coding distortion at the time of
searching a random codebook or adaptively selects one random codebook according to the result of analysis of speech
segments.

Brief Description of Drawings 

FIG. 1 is a schematic diagram of a conventional CELP type speech coder;
FIG. 2A is a block diagram of an excitation vector generating section in the speech coder in FIG. 1;
FIG. 2B is a block diagram of a modification of the excitation vector generating section which is designed to reduce the computation cost;
FIG. 2C is a block diagram of an excitation vector generating section in a speech decoder which is used as a pair with the speech coder in FIG. 1;
FIG. 3 is a block diagram of the essential portions of a speech coder according to a first mode;
FIG. 4 is a block diagram of an excitation vector generator equipped in the speech coder of the first mode;
FIG. 5 is a block diagram of the essential portions of a speech coder according to a second mode;
FIG. 6 is a block diagram of an excitation vector generator equipped in the speech coder of the second mode;
FIG. 7 is a block diagram of the essential portions of a speech coder according to third and fourth modes;
FIG. 8 is a block diagram of an excitation vector generator equipped in the speech coder of the third mode;
FIG. 9 is a block diagram of a non-linear digital filter equipped in the speech coder of the fourth mode;
FIG. 10 is a diagram of the adder characteristic of the non-linear digital filter shown in FIG. 9;
FIG. 11 is a block diagram of the essential portions of a speech coder according to a fifth mode;
FIG. 12 is a block diagram of the essential portions of a speech coder according to a sixth mode;
FIG. 13A is a block diagram of the essential portions of a speech coder according to a seventh mode;
FIG. 13B is a block diagram of the essential portions of the speech coder according to the seventh mode;
FIG. 14 is a block diagram of the essential portions of a speech decoder according to an eighth mode;
FIG. 15 is a block diagram of the essential portions of a speech coder according to a ninth mode;
FIG. 16 is a block diagram of a quantization target LSP adding section equipped in the speech coder according to the ninth mode;
FIG. 17 is a block diagram of an LSP quantizing/decoding section equipped in the speech coder according to the ninth mode;
FIG. 18 is a block diagram of the essential portions of a speech coder according to a tenth mode;
FIG. 19A is a block diagram of the essential portions of a speech coder according to an eleventh mode;
FIG. 19B is a block diagram of the essential portions of a speech decoder according to the eleventh mode;
FIG. 20 is a block diagram of the essential portions of a speech coder according to a twelfth mode;
FIG. 21 is a block diagram of the essential portions of a speech coder according to a thirteenth mode;
FIG. 22 is a block diagram of the essential portions of a speech coder according to a fourteenth mode;
FIG. 23 is a block diagram of the essential portions of a speech coder according to a fifteenth mode;
FIG. 24 is a block diagram of the essential portions of a speech coder according to a sixteenth mode;
FIG. 25 is a block diagram of a vector quantizing section in the sixteenth mode;
FIG. 26 is a block diagram of a parameter coding section of a speech coder according to a seventeenth mode; and
FIG. 27 is a block diagram of a noise canceler according to an eighteenth mode.

Best Modes for Carrying Out the Invention 

Preferred modes of the present invention will now be described specifically with reference to the accompanying
drawings.
(First Mode) 

FIG. 3 is a block diagram of the essential portions of a speech coder according to this mode. This speech coder
comprises an excitation vector generator 30, which has a seed storage section 31 and an oscillator 32, and an LPC
synthesis filter 33.

Seeds (oscillation seeds) 34 output from the seed storage section 31 are input to the oscillator 32. The oscillator
32 outputs different vector sequences according to the values of the input seeds. The oscillator 32 oscillates with
the content according to the value of the seed (oscillation seed) 34 and outputs an excitation vector 35 as a vector
sequence. The LPC synthesis filter 33 is supplied with vocal tract information in the form of the impulse response
convolution matrix of the synthesis filter, and performs convolution on the excitation vector 35 with the impulse
response, yielding a synthesized speech 36. The impulse response convolution of the excitation vector 35 is called
LPC synthesis.

FIG. 4 shows the specific structure of the excitation vector generator 30. A seed to be read from the seed storage
section 31 is switched by a control switch 41 for the seed storage section in accordance with a control signal given
from a distortion calculator.

Simply storing a plurality of seeds for outputting different vector sequences from the oscillator 32 in the seed
storage section 31 allows more random code vectors to be generated with less capacity as compared with a case where
complicated random code vectors are directly stored in a random codebook.

Although this mode has been described as a speech coder, the excitation vector generator 30 can be adapted to a
speech decoder. In this case, the speech decoder has a seed storage section with the same contents as those of the
seed storage section 31 of the speech coder and the control switch 41 for the seed storage section is supplied with a
seed number selected at the time of coding.
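
The memory argument of this mode can be sketched in a few lines. The specification does not fix the oscillator at this point, so the linear congruential recurrence below is purely a stand-in chosen for illustration; the names `oscillator`, `seed_storage` and `excitation_vector` and the seed values are hypothetical.

```python
def oscillator(seed, length):
    """Stand-in oscillator: a linear congruential recurrence mapped to [-1, 1)."""
    state = seed
    vector = []
    for _ in range(length):
        state = (1103515245 * state + 12345) % 2**31
        vector.append(state / 2**30 - 1.0)
    return vector


# Seed storage section: one stored number per excitation vector,
# instead of `length` stored samples per vector in a ROM codebook.
seed_storage = [1, 2, 3, 4]


def excitation_vector(seed_number, length=40):
    """Control switch selects a seed; the oscillator expands it into a vector."""
    return oscillator(seed_storage[seed_number], length)


coder_side = excitation_vector(0)
decoder_side = excitation_vector(0)   # same seed number, same seed storage
```

Because the oscillator is deterministic, a decoder holding the same seed storage regenerates exactly the vector the coder selected from the transmitted seed number alone.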



(Second Mode)

FIG. 5 is a block diagram of the essential portions of a speech coder according to this mode. This speech coder 
comprises an excitation vector generator 50, which has a seed storage section 51 and a non-linear oscillator 52, and 
an LPC synthesis filter 53. 

Seeds (oscillation seeds) 54 output from the seed storage section 51 are input to the non-linear oscillator 52. An
excitation vector 55 as a vector sequence output from the non-linear oscillator 52 is input to the LPC synthesis
filter 53. The output of the LPC synthesis filter 53 is a synthesized speech 56.

The non-linear oscillator 52 outputs different vector sequences according to the values of the input seeds 54, and
the LPC synthesis filter 53 performs LPC synthesis on the input excitation vector 55 to output the synthesized speech
56.

FIG. 6 shows the functional blocks of the excitation vector generator 50. A seed to be read from the seed storage
section 51 is switched by a control switch 41 for the seed storage section in accordance with a control signal given
from a distortion calculator.

The use of the non-linear oscillator 52 as an oscillator in the excitation vector generator 50 can suppress
divergence with oscillation according to the non-linear characteristic, and can provide practical excitation vectors.

Although this mode has been described as a speech coder, the excitation vector generator 50 can be adapted to a 
speech decoder. In this case, the speech decoder has a seed storage section with the same contents as those of the 
seed storage section 51 of the speech coder and the control switch 41 for the seed storage section is supplied with a 
seed number selected at the time of coding. 


(Third Mode) 



FIG. 7 is a block diagram of the essential portions of a speech coder according to this mode. This speech coder
comprises an excitation vector generator 70, which has a seed storage section 71 and a non-linear digital filter 72,
and an LPC synthesis filter 73. In the diagram, numeral "74" denotes a seed (oscillation seed) which is output from
the seed storage section 71 and input to the non-linear digital filter 72, numeral "75" is an excitation vector as a
vector sequence output from the non-linear digital filter 72, and numeral "76" is a synthesized speech output from
the LPC synthesis filter 73.

The excitation vector generator 70 has a control switch 41 for the seed storage section which switches a seed to
be read from the seed storage section 71 in accordance with a control signal given from a distortion calculator, as
shown in FIG. 8.

The non-linear digital filter 72 outputs different vector sequences according to the values of the input seeds, and
the LPC synthesis filter 73 performs LPC synthesis on the input excitation vector 75 to output the synthesized speech
76.

The use of the non-linear digital filter 72 as an oscillator in the excitation vector generator 70 can suppress
divergence with oscillation according to the non-linear characteristic, and can provide practical excitation vectors.
Although this mode has been described as a speech coder, the excitation vector generator 70 can be adapted to a
speech decoder. In this case, the speech decoder has a seed storage section with the same contents as those of the
seed storage section 71 of the speech coder and the control switch 41 for the seed storage section is supplied with
a seed number selected at the time of coding.

(Fourth Mode) 


A speech coder according to this mode comprises an excitation vector generator 70, which has a seed storage sec- 
tion 71 and a non-linear digital filter 72, and an LPC synthesis filter 73, as shown in FIG. 7. 

Particularly, the non-linear digital filter 72 has a structure as depicted in FIG. 9. This non-linear digital filter
72 includes an adder 91 having a non-linear adder characteristic as shown in FIG. 10, filter state holding sections
92 to 93 capable of retaining the states (the values of y(k-1) to y(k-N)) of the digital filter, and multipliers 94
to 95, which are connected in parallel to the outputs of the respective filter state holding sections 92-93, multiply
filter states by gains and output the results to the adder 91. The initial values of the filter states are set in the
filter state holding sections 92-93 by seeds read from the seed storage section 71. The values of the gains of the
multipliers 94-95 are so fixed that the poles of the digital filter lie outside the unit circle on the Z plane.

FIG. 10 is a conceptual diagram of the non-linear adder characteristic of the adder 91 equipped in the non-linear
digital filter 72, and shows the input/output relation of the adder 91 which has a 2's complement characteristic. The
adder 91 first acquires the sum of adder inputs, that is, the sum of the input values to the adder 91, and then uses
the non-linear characteristic illustrated in FIG. 10 to compute an adder output corresponding to the input sum.

In particular, the non-linear digital filter 72 is a second-order all-pole model so that the two filter state holding
sections 92 and 93 are connected in series, and the multipliers 94 and 95 are connected to the outputs of the filter
state holding sections 92 and 93. Further, the digital filter in which the non-linear adder characteristic of the
adder 91 is a 2's complement characteristic is used. Furthermore, the seed storage section 71 retains seed vectors of
32 words as particularly described in Table 1.

Table 1

Seed vectors for generating random code vectors

 i   Sy(n-1)[i]   Sy(n-2)[i]     i   Sy(n-1)[i]   Sy(n-2)[i]
 1    0.250000     0.250000      9    0.109521    -0.761210
 2   -0.564643    -0.104927     10   -0.202115     0.198718
 3    0.173879    -0.978792     11   -0.095041     0.863849
 4    0.632652     0.951133     12   -0.634213     0.424549
 5    0.920360    -0.113881     13    0.948225    -0.184861
 6    0.864873    -0.860368     14   -0.958269     0.969458
 7    0.732227     0.497037     15    0.233709    -0.057248
 8    0.917543    -0.035103     16   -0.852085    -0.564948


In the thus constituted speech coder, seed vectors read from the seed storage section 71 are given as initial values
to the filter state holding sections 92 and 93 of the non-linear digital filter 72. Every time zero is input to the
adder 91 from an input vector (zero sequences), the non-linear digital filter 72 outputs one sample (y(k)) at a time,
which is sequentially transferred as a filter state to the filter state holding sections 92 and 93. At this time, the
multipliers 94 and 95 multiply the filter states output from the filter state holding sections 92 and 93 by gains a1
and a2 respectively. The adder 91 adds the outputs of the multipliers 94 and 95 to acquire the sum of the adder
inputs, and generates an adder output which is suppressed between +1 and -1 based on the characteristic in FIG. 10.
This adder output (y(k+1)) is output as an excitation vector and is sequentially transferred to the filter state
holding sections 92 and 93 to produce a new sample (y(k+2)).
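
The operation described above can be sketched as follows. Several details are assumptions made for illustration: the 2's complement characteristic of FIG. 10 is modeled as a wrap of the sum into [-1, 1), the recurrence is written as y(k) = a1*y(k-1) + a2*y(k-2) with zero input, and the gains a1 = 0.5 and a2 = -2.0 are hypothetical values chosen so that both poles have magnitude sqrt(2), that is, outside the unit circle. Only the seed is taken from Table 1 (entry i = 1).

```python
def wrap(total):
    """Assumed 2's complement overflow characteristic: wrap the sum into [-1, 1)."""
    return (total + 1.0) % 2.0 - 1.0


def nonlinear_filter(seed, a1, a2, length):
    """Second-order all-pole filter with a wrapping adder, driven by zero input."""
    y1, y2 = seed                      # initial filter states y(k-1), y(k-2)
    output = []
    for _ in range(length):
        y = wrap(a1 * y1 + a2 * y2)    # adder 91: sum of multiplier outputs, then wrap
        output.append(y)
        y1, y2 = y, y1                 # shift the filter state holding sections
    return output


seed_1 = (0.250000, 0.250000)          # Table 1, i = 1
excitation = nonlinear_filter(seed_1, a1=0.5, a2=-2.0, length=52)
```

Without the wrap the recurrence would grow without bound because the poles are unstable; with it every sample stays within [-1, 1) while the sequence keeps an irregular, noise-like character.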

According to this mode, the coefficients 1 to N of the multipliers 94-95 are fixed so that the poles of the
non-linear digital filter lie outside the unit circle on the Z plane, and the adder 91 is provided with a non-linear
adder characteristic. The divergence of the output can therefore be suppressed even when the input to the non-linear
digital filter 72 becomes large, and excitation vectors good for practical use can be generated continuously.
Further, the randomness of excitation vectors to be generated can be secured.

Although this mode has been described as a speech coder, the excitation vector generator 70 can be adapted to a
speech decoder. In this case, the speech decoder has a seed storage section with the same contents as those of the
seed storage section 71 of the speech coder and the control switch 41 for the seed storage section is supplied with a
seed number selected at the time of coding.

(Fifth Mode) 


FIG. 11 is a block diagram of the essential portions of a speech coder according to this mode. This speech coder
comprises an excitation vector generator 110, which has an excitation vector storage section 111 and an
added-excitation-vector generator 112, and an LPC synthesis filter 113.

The excitation vector storage section 111 retains old excitation vectors which are read by a control switch upon
reception of a control signal from an unillustrated distortion calculator.

The added-excitation-vector generator 112 performs a predetermined process, indicated by an added-excitation-vector
number, on an old excitation vector read from the excitation vector storage section 111 to produce a new excitation
vector. The added-excitation-vector generator 112 has a function of switching the process content for an old
excitation vector in accordance with the added-excitation-vector number.

According to the thus constituted speech coder, an added-excitation-vector number is given from the distortion
calculator which is executing, for example, an excitation vector search. The added-excitation-vector generator 112
executes different processes on old excitation vectors depending on the value of the input added-excitation-vector
number to generate different added excitation vectors, and the LPC synthesis filter 113 performs LPC synthesis on the
input excitation vector to output a synthesized speech.

According to this mode, random excitation vectors can be generated simply by storing fewer old excitation vectors
in the excitation vector storage section 111 and switching the process contents by means of the
added-excitation-vector generator 112, and it is unnecessary to store random code vectors directly in a random
codebook (ROM). This can significantly reduce the memory capacity.

Although this mode has been described as a speech coder, the excitation vector generator 110 can be adapted to
a speech decoder. In this case, the speech decoder has an excitation vector storage section with the same contents as
those of the excitation vector storage section 111 of the speech coder and an added-excitation-vector number selected
at the time of coding is given to the added-excitation-vector generator 112.

(Sixth Mode) 


FIG. 12 shows the functional blocks of an excitation vector generator according to this mode. This excitation vector
generator comprises an added-excitation-vector generator 120 and an excitation vector storage section 121 where a
plurality of element vectors 1 to N are stored.

The added-excitation-vector generator 120 includes a reading section 122 which performs a process of reading a
plurality of element vectors of different lengths from different positions in the excitation vector storage section
121, a reversing section 123 which performs a process of sorting the read element vectors in the reverse order, a
multiplying section 124 which performs a process of multiplying a plurality of vectors after the reverse process by
different gains respectively, a decimating section 125 which performs a process of shortening the vector lengths of a
plurality of vectors after the multiplication, an interpolating section 126 which performs a process of lengthening
the vector lengths of the decimated vectors, an adding section 127 which performs a process of adding the
interpolated vectors, and a process determining/instructing section 128 which has a function of determining a
specific processing scheme according to the value of the input added-excitation-vector number and instructing the
individual sections and a function of holding a conversion map (Table 2) between numbers and processes which is
referred to at the time of determining the specific process contents.

Table 2

Conversion map between numbers and processes

Bit stream (MSB...LSB)            6   5   4   3   2   1   0
V1 reading position (16 kinds)    -   -   -   3   2   1   0
V2 reading position (32 kinds)    2   1   0   -   -   4   3
V3 reading position (32 kinds)    4   3   2   1   0   -   -
Reverse process (2 kinds)         -   -   -   -   -   -   0
Multiplication (4 kinds)          1   0   -   -   -   -   -
Decimating process (4 kinds)      -   -   -   1   0   -   -
Interpolation (2 kinds)           -   -   0   -   -   -   -

Each entry gives the bit of the process parameter that is carried by that bit of the added-excitation-vector number;
a dash marks a bit that the process does not use.


The added-excitation-vector generator 120 will now be described more specifically. The added-excitation-vector
generator 120 determines specific processing schemes for the reading section 122, the reversing section 123, the
multiplying section 124, the decimating section 125, the interpolating section 126 and the adding section 127 by
comparing the input added-excitation-vector number (which is a sequence of 7 bits taking any integer value from 0 to
127) with the conversion map between numbers and processes (Table 2), and reports the specific processing schemes to
the respective sections.

The reading section 122 first extracts an element vector 1 (V1) of a length of 100 from one end of the excitation
vector storage section 121 to the position of n1, paying attention to a sequence of the lower four bits of the input
added-excitation-vector number (n1: an integer value from 0 to 15). Then, the reading section 122 extracts an element
vector 2 (V2) of a length of 78 from the end of the excitation vector storage section 121 to the position of n2+14
(an integer value from 14 to 45), paying attention to a sequence of five bits (n2: an integer value from 0 to 31)
having the lower two bits and the upper three bits of the input added-excitation-vector number linked together.
Further, the reading section 122 performs a process of extracting an element vector 3 (V3) of a length of Ns (= 52)
from one end of the excitation vector storage section 121 to the position of n3+46 (an integer value from 46 to 77),
paying attention to a sequence of the upper five bits of the input added-excitation-vector number (n3: an integer
value from 0 to 31), and sending V1, V2 and V3 to the reversing section 123.

The reversing section 123 sends vectors having V1, V2 and V3 rearranged in the reverse order to the multiplying
section 124 as new V1, V2 and V3 when the least significant bit of the added-excitation-vector number is "0," and
sends V1, V2 and V3 as they are to the multiplying section 124 when the least significant bit is "1."

Paying attention to a sequence of two bits having the upper seventh and sixth bits of the added-excitation-vector
number linked, the multiplying section 124 multiplies the amplitude of V2 by -2 when the bit sequence is "00,"
multiplies the amplitude of V3 by -2 when the bit sequence is "01," multiplies the amplitude of V1 by -2 when the
bit sequence is "10," or multiplies the amplitude of V2 by 2 when the bit sequence is "11," and sends the result as
new V1, V2 and V3 to the decimating section 125.

Paying attention to a sequence of two bits having the upper fourth and third bits of the added-excitation-vector
number linked, the decimating section 125

(a) sends vectors of 26 samples extracted every other sample from V1, V2 and V3 as new V1, V2 and V3 to the
interpolating section 126 when the bit sequence is "00,"

(b) sends vectors of 26 samples extracted every other sample from V1 and V3 and every third sample from V2 as
new V1, V2 and V3 to the interpolating section 126 when the bit sequence is "01,"

(c) sends vectors of 26 samples extracted every fourth sample from V1 and every other sample from V2 and V3 as
new V1, V2 and V3 to the interpolating section 126 when the bit sequence is "10," and

(d) sends vectors of 26 samples extracted every fourth sample from V1, every third sample from V2 and every other
sample from V3 as new V1, V2 and V3 to the interpolating section 126 when the bit sequence is "11."

Paying attention to the upper third bit of the added-excitation-vector number, the interpolating section 126

(a) sends vectors which have V1, V2 and V3 respectively substituted in even samples of zero vectors of a length
Ns (= 52) as new V1, V2 and V3 to the adding section 127 when the value of the third bit is "0," and

(b) sends vectors which have V1, V2 and V3 respectively substituted in odd samples of zero vectors of a length Ns
(= 52) as new V1, V2 and V3 to the adding section 127 when the value of the third bit is "1."

The adding section 127 adds the three vectors (V1, V2 and V3) produced by the interpolating section 126 to
generate an added excitation vector.

According to this mode, as apparent from the above, a plurality of processes are combined at random in accordance
with the added-excitation-vector number to produce random excitation vectors, so that it is unnecessary to store
random code vectors as they are in a random codebook (ROM), ensuring a significant reduction in memory capacity.

Note that the use of the excitation vector generator of this mode in the speech coder of the fifth mode can allow
complicated and random excitation vectors to be generated without using a large-capacity random codebook.
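
The decimating, interpolating and adding steps of this mode can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names `decimate`, `interpolate` and `add_vectors` are hypothetical, vectors are plain Python lists, and the reading, reversing and multiplying steps are left out.

```python
NS = 52  # subframe length Ns given in the text


def decimate(v, step, count=26):
    """Keep every `step`-th sample of v, up to `count` samples (26 in the text)."""
    return v[::step][:count]


def interpolate(v, odd):
    """Place the samples of v into the even (odd=False) or odd (odd=True)
    positions of a zero vector of length NS."""
    out = [0.0] * NS
    offset = 1 if odd else 0
    for n, sample in enumerate(v):
        out[2 * n + offset] = sample
    return out


def add_vectors(v1, v2, v3):
    """Element-wise sum of the three interpolated vectors."""
    return [a + b + c for a, b, c in zip(v1, v2, v3)]
```

For example, `add_vectors(interpolate(a, False), interpolate(b, False), interpolate(c, False))` gives a length-52 vector whose odd samples are all zero.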

(Seventh Mode) 

A description will now be given of a seventh mode in which the excitation vector generator of any one of the
above-described first to sixth modes is used in a CELP type speech coder that is based on the PSI-CELP, the standard
speech coding/decoding system for PDC digital portable telephones in Japan.

FIG. 13A presents a block diagram of a speech coder according to the seventh mode. In this speech coder, digital
input speech data 1300 is supplied to a buffer 1301 frame by frame (frame length Nf = 104). At this time, old data in
the buffer 1301 is updated with new data supplied. A frame power quantizing/decoding section 1302 first reads a
processing frame s(i) (0 ≤ i ≤ Nf-1) of a length Nf (= 104) from the buffer 1301 and acquires mean power amp of
samples in that processing frame from an equation 5.



$$\mathrm{amp} = \sqrt{\frac{\sum_{i=0}^{Nf-1} s^{2}(i)}{Nf}} \tag{5}$$



where amp: mean power of samples in a processing frame
i: element number (0 ≤ i ≤ Nf-1) in the processing frame
s(i): samples in the processing frame
Nf: processing frame length (= 104).



The acquired mean power amp of samples in the processing frame is converted to a logarithmically converted
value amplog from an equation 6.

$$\mathrm{amplog} = \frac{\log_{10}(255 \times \mathrm{amp} + 1)}{\log_{10}(255 + 1)} \tag{6}$$

where amplog: logarithmically converted value of the mean power of samples in the processing frame
amp: mean power of samples in the processing frame.

The acquired amplog is subjected to scalar quantization using a scalar-quantization table Cpow of 16 words as
shown in Table 3 stored in a power quantization table storage section 1303 to acquire an index of power Ipow of four
bits. Decoded frame power spow is obtained from the acquired index of power Ipow, and the index of power Ipow and
decoded frame power spow are supplied to a parameter coding section 1331. The power quantization table storage
section 1303 holds a power scalar-quantization table (Table 3) of 16 words, which is referred to when the frame
power quantizing/decoding section 1302 carries out scalar quantization of the logarithmically converted value of the
mean power of the samples in the processing frame.



Table 3

Power scalar-quantization table

 i   Cpow(i)     i   Cpow(i)
 1   0.00675     9   0.39247
 2   0.06217    10   0.42920
 3   0.10877    11   0.46252
 4   0.16637    12   0.49503
 5   0.21876    13   0.52784
 6   0.26123    14   0.56484
 7   0.30799    15   0.61125
 8   0.35228    16   0.67498
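
As a rough illustration of the equation 6 and the table lookup, the sketch below converts a power value with the logarithmic mapping and then picks the closest entry of Table 3. The nearest-neighbour rule, the zero-based index and the names `log_power` and `quantize_power` are assumptions made for this example; the text does not state how the decoded frame power spow is derived from the selected entry, so the sketch simply returns the table value.

```python
import math

# Table 3 values (the table numbers them 1 to 16; the list index here is 0-based).
CPOW = [0.00675, 0.06217, 0.10877, 0.16637, 0.21876, 0.26123, 0.30799, 0.35228,
        0.39247, 0.42920, 0.46252, 0.49503, 0.52784, 0.56484, 0.61125, 0.67498]


def log_power(amp):
    """Equation 6: logarithmic conversion of the mean power amp."""
    return math.log10(255.0 * amp + 1.0) / math.log10(255.0 + 1.0)


def quantize_power(amplog):
    """Return (index, table value) of the entry closest to amplog.
    Nearest-neighbour search is an assumption, not taken from the text."""
    ipow = min(range(len(CPOW)), key=lambda i: abs(CPOW[i] - amplog))
    return ipow, CPOW[ipow]
```

With 16 entries the returned index fits in the four bits mentioned for Ipow.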



An LPC analyzing section 1304 first reads analysis segment data of an analysis segment length Nw (= 256) from
the buffer 1301, multiplies the read analysis segment data by a Hamming window of a window length Nw (= 256) to
yield Hamming windowed analysis data, and acquires the autocorrelation function of the obtained Hamming windowed
analysis data up to a prediction order Np (= 10). The obtained autocorrelation function is multiplied by a lag window
table (Table 4) of 10 words stored in a lag window storage section 1305 to acquire a lag-windowed autocorrelation
function. The LPC analyzing section 1304 then performs linear predictive analysis on the obtained lag-windowed
autocorrelation function to compute an LPC parameter α(i) (1 ≤ i ≤ Np) and outputs the parameter to a pitch
pre-selector 1308.



Table 4

Lag window table

 i   Wlag(i)       i   Wlag(i)
 0   0.9994438     5   0.9801714
 1   0.9977772     6   0.9731081
 2   0.9950056     7   0.9650213
 3   0.9911382     8   0.9559375
 4   0.9861880     9   0.9458861
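
The windowing and autocorrelation steps of the LPC analyzing section can be sketched as follows. This is a minimal illustration, not the coder's implementation: the Hamming window uses the common 0.54/0.46 definition, and the mapping of the 10-word lag window onto lags 1 to 10 (with lag 0 left unscaled) is an assumption, since the text does not say how Wlag(0) to Wlag(9) are aligned with the lags. All function names are hypothetical.

```python
import math


def hamming(n_w):
    """Hamming window of length n_w (standard 0.54 / 0.46 form)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_w - 1)) for n in range(n_w)]


def autocorrelation(x, order):
    """Autocorrelation of x for lags 0..order."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(order + 1)]


def lag_windowed(r, wlag):
    """Scale lag i (i >= 1) by wlag[i - 1]; lag 0 is kept as is (assumed alignment)."""
    return [r[0]] + [r[i] * wlag[i - 1] for i in range(1, len(r))]
```

A frame would be processed as `lag_windowed(autocorrelation([w * s for w, s in zip(hamming(256), frame)], 10), WLAG)`, where `WLAG` holds the ten values of Table 4.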



Next, the obtained LPC parameter α(i) is converted to an LSP (Line Spectrum Pair) ω(i) (1 ≤ i ≤ Np) which is in
turn output to an LSP quantizing/decoding section 1306. The lag window storage section 1305 holds a lag window
table to which the LPC analyzing section refers.

The LSP quantizing/decoding section 1306 first refers to a vector quantization table of an LSP stored in an LSP
quantization table storage section 1307 to perform vector quantization on the LSP received from the LPC analyzing
section 1304, thereby selecting an optimal index, and sends the selected index as an LSP code Ilsp to the parameter
coding section 1331. Then, a centroid corresponding to the LSP code is read as a decoded LSP ωq(i) (1 ≤ i ≤ Np) from
the LSP quantization table storage section 1307, and the read decoded LSP is sent to an LSP interpolation section
1311. Further, the decoded LSP is converted to an LPC to acquire a decoded LPC αq(i) (1 ≤ i ≤ Np), which is in turn
sent to a spectral weighting filter coefficients calculator 1312 and a perceptual weighted LPC synthesis filter
coefficients calculator 1314. The LSP quantization table storage section 1307 holds an LSP vector quantization table
to which the LSP quantizing/decoding section 1306 refers when performing vector quantization on an LSP.

The pitch pre-selector 1308 first subjects the processing frame data s(i) (0 ≤ i ≤ Nf-1) read from the buffer 1301 to
inverse filtering using the LPC α(i) (1 ≤ i ≤ Np) received from the LPC analyzing section 1304 to obtain a linear
predictive residual signal res(i) (0 ≤ i ≤ Nf-1), computes the power of the obtained linear predictive residual signal
res(i), acquires a normalized predictive residual power resid resulting from normalization of the power of the
computed residual signal with the power of speech samples of a processing subframe, and sends the normalized
predictive residual power to the parameter coding section 1331. Next, the linear predictive residual signal res(i) is
multiplied by a Hamming window of a length Nw (= 256) to produce a Hamming windowed linear predictive residual
signal resw(i) (0 ≤ i ≤ Nw-1), and an autocorrelation function φint(i) of the produced resw(i) is obtained over a range
of Lmin-2 ≤ i ≤ Lmax+2 (where Lmin is 16 in the shortest analysis segment of a long predictive coefficient and Lmax
is 128 in the longest analysis segment of a long predictive coefficient). A polyphase filter coefficient Cppf (Table 5)
of 28 words stored in a polyphase coefficients storage section 1309 is convolved with the obtained autocorrelation
function φint(i) to acquire an autocorrelation function φdq(i) at a fractional position shifted by -1/4 from an integer
lag int, an autocorrelation function φaq(i) at a fractional position shifted by +1/4 from the integer lag int, and an
autocorrelation function φah(i) at a fractional position shifted by +1/2 from the integer lag int.

Table 5

Polyphase filter coefficients Cppf

 i   Cppf(i)       i   Cppf(i)       i   Cppf(i)       i   Cppf(i)
 0    0.100035     7    0.000000    14   -0.128617    21   -0.212207
 1   -0.180063     8    0.000000    15    0.300105    22    0.636620
 2    0.900316     9    1.000000    16    0.900316    23    0.636620
 3    0.300105    10    0.000000    17   -0.180063    24   -0.212207
 4   -0.128617    11    0.000000    18    0.100035    25    0.127324
 5    0.081847    12    0.000000    19   -0.069255    26   -0.090946
 6   -0.060021    13    0.000000    20    0.052960    27    0.070736



Further, for each argument i in a range of Lmin-2 ≤ i ≤ Lmax+2, a process of an equation 7 of substituting the
largest one of φint(i), φdq(i), φaq(i) and φah(i) in φmax(i) is performed to acquire (Lmax - Lmin + 1) pieces of φmax(i).

$$\phi_{max}(i) = \mathrm{MAX}\bigl(\phi_{int}(i),\ \phi_{dq}(i),\ \phi_{aq}(i),\ \phi_{ah}(i)\bigr) \tag{7}$$

where φmax(i): the maximum value among φint(i), φdq(i), φaq(i), φah(i)
i: analysis segment of a long predictive coefficient (Lmin ≤ i ≤ Lmax)
Lmin: shortest analysis segment (= 16) of the long predictive coefficient
Lmax: longest analysis segment (= 128) of the long predictive coefficient
φint(i): autocorrelation function of an integer lag (int) of a predictive residual signal
φdq(i): autocorrelation function of a fractional lag (int-1/4) of the predictive residual signal
φaq(i): autocorrelation function of a fractional lag (int+1/4) of the predictive residual signal
φah(i): autocorrelation function of a fractional lag (int+1/2) of the predictive residual signal.

The six largest values are selected from the acquired (Lmax - Lmin + 1) pieces of φmax(i) and are saved as pitch
candidates psel(i) (0 ≤ i ≤ 5). The linear predictive residual signal res(i) and the first pitch candidate psel(0) are
sent to a pitch weighting filter calculator 1310, and psel(i) (0 ≤ i ≤ 5) is sent to an adaptive code vector generator 1319.
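
The selection of pitch candidates can be sketched as below. The sketch assumes that the four autocorrelation functions are already available as dictionaries keyed by integer lag and that the candidates are the lags whose φmax values are largest; the function name `pitch_candidates` and that data layout are hypothetical.

```python
def pitch_candidates(phi_int, phi_dq, phi_aq, phi_ah, lmin=16, lmax=128, count=6):
    """Return the `count` integer lags with the largest phi_max (equation 7),
    ordered from the largest value downwards."""
    phi_max = {
        lag: max(phi_int[lag], phi_dq[lag], phi_aq[lag], phi_ah[lag])
        for lag in range(lmin, lmax + 1)
    }
    return sorted(phi_max, key=phi_max.get, reverse=True)[:count]
```

The first element of the returned list plays the role of psel(0), which the text passes on to the pitch weighting filter calculator 1310.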

The polyphase coefficients storage section 1309 holds polyphase filter coefficients to be referred to when the
pitch pre-selector 1308 acquires the autocorrelation of the linear predictive residual signal to a fractional lag
precision and when the adaptive code vector generator 1319 produces adaptive code vectors to a fractional precision.

The pitch weighting filter calculator 1310 acquires pitch predictive coefficients cov(i) (0 ≤ i ≤ 2) of a third order
from the linear predictive residuals res(i) and the first pitch candidate psel(0) obtained by the pitch pre-selector
1308. The impulse response of a pitch weighting filter Q(z) is obtained from an equation 8 which uses the acquired
pitch predictive coefficients cov(i) (0 ≤ i ≤ 2), and is sent to the spectral weighting filter coefficients calculator
1312 and a perceptual weighting filter coefficients calculator 1313.

$$Q(z) = 1 + \sum_{i=0}^{2} cov(i) \times \lambda_{pi} \times z^{-psel(0)+i-1} \tag{8}$$

where Q(z): transfer function of the pitch weighting filter
cov(i): pitch predictive coefficients (0 ≤ i ≤ 2)
λpi: pitch weighting constant (= 0.4)
psel(0): first pitch candidate.

55 

The LSP interpolation section 1311 first acquires a decoded interpolated LSP ωintp(n,i) (1 ≤ i ≤ Np) subframe by
subframe from an equation 9 which uses a decoded LSP ωq(i) for the current processing frame, obtained by the LSP
quantizing/decoding section 1306, and a decoded LSP ωqp(i) for a previous processing frame which has been acquired
and saved earlier.

$$\omega_{intp}(n,i) = \begin{cases} 0.4 \times \omega_{q}(i) + 0.6 \times \omega_{qp}(i) & n = 1 \\ \omega_{q}(i) & n = 2 \end{cases} \tag{9}$$

where ωintp(n,i): interpolated LSP of the n-th subframe
n: subframe number (= 1, 2)
ωq(i): decoded LSP of a processing frame
ωqp(i): decoded LSP of a previous processing frame.
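
The subframe interpolation of the equation 9 amounts to a weighted average for the first subframe and a plain copy for the second. A minimal sketch, with the hypothetical function name `interpolate_lsp` and LSP vectors held as Python lists:

```python
def interpolate_lsp(omega_q, omega_qp, n):
    """Equation 9: decoded interpolated LSP of subframe n (1 or 2).
    omega_q  : decoded LSP of the current processing frame
    omega_qp : decoded LSP of the previous processing frame"""
    if n == 1:
        return [0.4 * q + 0.6 * qp for q, qp in zip(omega_q, omega_qp)]
    if n == 2:
        return list(omega_q)
    raise ValueError("subframe number must be 1 or 2")
```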

15 

A decoded interpolated LPC αq(n,i) (1 ≤ i ≤ Np) is obtained by converting the acquired ωintp(n,i) to an LPC, and the
acquired decoded interpolated LPC αq(n,i) (1 ≤ i ≤ Np) is sent to the spectral weighting filter coefficients calculator
1312 and the perceptual weighted LPC synthesis filter coefficients calculator 1314.

The spectral weighting filter coefficients calculator 1312, which constitutes an MA type spectral weighting filter I(z)
in an equation 10, sends its impulse response to the perceptual weighting filter coefficients calculator 1313.

$$I(z) = \sum_{i=1}^{Nfir} \alpha_{fir}(i) \times z^{-i} \tag{10}$$

where I(z): transfer function of the MA type spectral weighting filter
Nfir: filter order (= 11) of I(z)
αfir(i): filter coefficients (1 ≤ i ≤ Nfir) of I(z).

30 

Note that the impulse response αfir(i) (1 ≤ i ≤ Nfir) in the equation 10 is an impulse response of an ARMA type
spectral weighting filter G(z), given by an equation 11, cut after Nfir (= 11).

$$G(z) = \frac{1 + \sum_{i=1}^{Np} \alpha(n,i) \times \lambda_{ma}^{\,i} \times z^{-i}}{1 + \sum_{i=1}^{Np} \alpha(n,i) \times \lambda_{ar}^{\,i} \times z^{-i}} \tag{11}$$

where G(z): transfer function of the spectral weighting filter
n: subframe number (= 1, 2)
Np: LPC analysis order (= 10)
α(n,i): decoded interpolated LPC of the n-th subframe
λma: numerator constant (= 0.9) of G(z)
λar: denominator constant (= 0.4) of G(z).
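
Cutting the impulse response of the ARMA filter G(z) after Nfir samples can be sketched with a direct recursion. The function name `truncated_impulse_response` is hypothetical, and the sketch returns the samples g(0) to g(Nfir-1); how these samples are mapped onto the indices 1 to Nfir of αfir(i) is not stated in the text and is left open here.

```python
def truncated_impulse_response(alpha, lam_ma=0.9, lam_ar=0.4, n_fir=11):
    """First n_fir impulse response samples of
    G(z) = (1 + sum a_i*lam_ma^i z^-i) / (1 + sum a_i*lam_ar^i z^-i),
    where alpha[i-1] holds the LPC coefficient of order i."""
    order = len(alpha)
    num = [1.0] + [alpha[i - 1] * lam_ma ** i for i in range(1, order + 1)]
    den = [1.0] + [alpha[i - 1] * lam_ar ** i for i in range(1, order + 1)]
    g = []
    for n in range(n_fir):
        value = num[n] if n <= order else 0.0      # numerator (MA) contribution
        for i in range(1, min(n, order) + 1):      # denominator (AR) recursion
            value -= den[i] * g[n - i]
        g.append(value)
    return g
```

When both constants are equal the numerator and denominator cancel and the response reduces to a unit impulse, which is a quick sanity check for the recursion.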

The perceptual weighting filter coefficients calculator 1313 first constitutes a perceptual weighting filter W(z) which
has as an impulse response the result of convolution of the impulse response of the spectral weighting filter I(z)
received from the spectral weighting filter coefficients calculator 1312 and the impulse response of the pitch weighting
filter Q(z) received from the pitch weighting filter calculator 1310, and sends the impulse response of the constituted
perceptual weighting filter W(z) to the perceptual weighted LPC synthesis filter coefficients calculator 1314 and a
perceptual weighting section 1315.

The perceptual weighted LPC synthesis filter coefficients calculator 1314 constitutes a perceptual weighted LPC
synthesis filter H(z) from an equation 12 based on the decoded interpolated LPC αq(n,i) received from the LSP
interpolation section 1311 and the perceptual weighting filter W(z) received from the perceptual weighting filter
coefficients calculator 1313.

$$H(z) = \frac{1}{1 + \sum_{i=1}^{Np} \alpha_{q}(n,i) \times z^{-i}}\, W(z) \tag{12}$$

where H(z): transfer function of the perceptual weighted synthesis filter
Np: LPC analysis order
αq(n,i): decoded interpolated LPC of the n-th subframe
n: subframe number (= 1, 2)
W(z): transfer function of the perceptual weighting filter (I(z) and Q(z) cascade-connected).

The coefficient of the constituted perceptual weighted LPC synthesis filter H(z) is sent to a target vector generator
A 1316, a perceptual weighted LPC reverse synthesis filter A 1317, a perceptual weighted LPC synthesis filter A 1321,
a perceptual weighted LPC reverse synthesis filter B 1326 and a perceptual weighted LPC synthesis filter B 1329.

The perceptual weighting section 1315 inputs a subframe signal read from the buffer 1301 to the perceptual
weighted LPC synthesis filter H(z) in a zero state, and sends its outputs as perceptual weighted residuals spw(i)
(0 ≤ i ≤ Ns-1) to the target vector generator A 1316.

The target vector generator A 1316 subtracts a zero input response Zres(i) (0 ≤ i ≤ Ns-1), which is an output when
a zero sequence is input to the perceptual weighted LPC synthesis filter H(z) obtained by the perceptual weighted LPC
synthesis filter coefficients calculator 1314, from the perceptual weighted residuals spw(i) (0 ≤ i ≤ Ns-1) obtained by
the perceptual weighting section 1315, and sends the subtraction result to the perceptual weighted LPC reverse
synthesis filter A 1317 and a target vector generator B 1325 as a target vector r(i) (0 ≤ i ≤ Ns-1) for selecting an
excitation vector.

The perceptual weighted LPC reverse synthesis filter A 1317 sorts the target vector r(i) (0 ≤ i ≤ Ns-1) received
from the target vector generator A 1316 in a time reverse order, inputs the acquired vector to the perceptual weighted
LPC synthesis filter H(z) with the initial state of zero, and sorts its outputs again in a time reverse order to obtain
the time reverse synthesis rh(k) (0 ≤ k ≤ Ns-1) of the target vector, and sends the vector to a comparator A 1322.
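
The point of the time reverse synthesis is that the inner product of a code vector with rh(k) equals the inner product of the synthesized code vector with the target r(k), so candidates can be ranked without synthesizing each one. The sketch below illustrates this under the assumption that the zero-state synthesis filter H(z) is represented by a (truncated) impulse response `h`; the function names are hypothetical.

```python
def synthesize(h, x):
    """Zero-state filtering of x by impulse response h, output truncated to len(x)."""
    return [
        sum(h[n - k] * x[k] for k in range(n + 1) if n - k < len(h))
        for n in range(len(x))
    ]


def time_reverse_synthesis(h, r):
    """Reverse r in time, filter it, and reverse the output again."""
    return synthesize(h, r[::-1])[::-1]


def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```

For any code vector `x`, `inner(x, time_reverse_synthesis(h, r))` matches `inner(synthesize(h, x), r)` up to rounding.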

Stored in an adaptive codebook 1318 are old excitation vectors which are referred to when the adaptive code vector
generator 1319 generates adaptive code vectors. The adaptive code vector generator 1319 generates Nac pieces of
adaptive code vectors Pacb(i,k) (0 ≤ i ≤ Nac-1, 0 ≤ k ≤ Ns-1, 6 ≤ Nac ≤ 24) based on six pitch candidates psel(j)
(0 ≤ j ≤ 5) received from the pitch pre-selector 1308, and sends the vectors to an adaptive/fixed selector 1320.
Specifically, as shown in Table 6, adaptive code vectors are generated for four kinds of fractional lag positions per a
single integer lag position when 16 ≤ psel(j) ≤ 44, adaptive code vectors are generated for two kinds of fractional lag
positions per a single integer lag position when 46 ≤ psel(j) ≤ 64, and adaptive code vectors are generated for integer
lag positions when 65 ≤ psel(j) ≤ 128. From this, depending on the value of psel(j) (0 ≤ j ≤ 5), the number of adaptive
code vector candidates Nac is 6 at a minimum and 24 at a maximum.



Table 6

Total number of adaptive code vectors and fixed code vectors

Total number of vectors              255
Number of adaptive code vectors      222
    16 ≤ psel(i) ≤ 44                116 (29 x four kinds of fractional lags)
    45 ≤ psel(i) ≤ 64                42 (21 x two kinds of fractional lags)
    65 ≤ psel(i) ≤ 128               64 (64 x one kind of fractional lag)
Number of fixed code vectors         32 (16 x two kinds of codes)
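
The range of Nac (6 to 24) follows from counting the fractional lag positions generated for each of the six pitch candidates. A small sketch, with hypothetical function names; the middle range starts at 45 as listed in Table 6, whereas the prose gives 46, so that boundary is an assumption of this example:

```python
def fractional_lag_kinds(psel):
    """Number of lag positions generated for one pitch candidate."""
    if 16 <= psel <= 44:
        return 4
    if 45 <= psel <= 64:   # boundary taken from Table 6 (assumption)
        return 2
    if 65 <= psel <= 128:
        return 1
    raise ValueError("pitch candidate outside the range 16..128")


def adaptive_candidate_count(candidates):
    """Nac: total number of adaptive code vectors for the given pitch candidates."""
    return sum(fractional_lag_kinds(p) for p in candidates)
```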



Adaptive code vectors to a fractional precision are generated through an interpolation which convolves the
coefficients of the polyphase filter stored in the polyphase coefficients storage section 1309.

Interpolation corresponding to the value of lagf(i) means interpolation corresponding to an integer lag position
when lagf(i) = 0, interpolation corresponding to a fractional lag position shifted by -1/2 from an integer lag position
when lagf(i) = 1, interpolation corresponding to a fractional lag position shifted by +1/4 from an integer lag position
when lagf(i) = 2, and interpolation corresponding to a fractional lag position shifted by -1/4 from an integer lag
position when lagf(i) = 3.

The adaptive/fixed selector 1320 first receives adaptive code vectors of the Nac (6 to 24) candidates generated by
the adaptive code vector generator 1319 and sends the vectors to the perceptual weighted LPC synthesis filter A 1321
and the comparator A 1322.

To pre-select the adaptive code vectors Pacb(i,k) (0 ≤ i ≤ Nac-1, 0 ≤ k ≤ Ns-1, 6 ≤ Nac ≤ 24) generated by the
adaptive code vector generator 1319 to Nacb (= 4) candidates from Nac (6 to 24) candidates, the comparator A 1322
first acquires the inner products prac(i) of the time reverse synthesized vector rh(k) (0 ≤ k ≤ Ns-1) of the target
vector, received from the perceptual weighted LPC reverse synthesis filter A 1317, and the adaptive code vectors
Pacb(i,k) from an equation 13.

$$prac(i) = \sum_{k=0}^{Ns-1} Pacb(i,k) \times rh(k) \tag{13}$$

where prac(i): reference value for pre-selection of adaptive code vectors
Nac: the number of adaptive code vector candidates before pre-selection (= 6 to 24)
i: number of an adaptive code vector (0 ≤ i ≤ Nac-1)
Pacb(i,k): adaptive code vector
rh(k): time reverse synthesis of the target vector r(k).

20 

By comparing the obtained inner products prac(i), the top Nacb (= 4) indices in descending order of the inner
products and the inner products with those indices used as arguments are selected and are respectively saved as
indices of adaptive code vectors after pre-selection apsel(j) (0 ≤ j ≤ Nacb-1) and reference values after pre-selection
of adaptive code vectors prac(apsel(j)), and the indices of adaptive code vectors after pre-selection apsel(j)
(0 ≤ j ≤ Nacb-1) are output to the adaptive/fixed selector 1320.

The perceptual weighted LPC synthesis filter A 1321 performs perceptual weighted LPC synthesis on adaptive
code vectors after pre-selection Pacb(apsel(j),k), which have been generated by the adaptive code vector generator
1319 and have passed the adaptive/fixed selector 1320, to generate synthesized adaptive code vectors
SYNacb(apsel(j),k) which are in turn sent to the comparator A 1322. Then, the comparator A 1322 acquires reference
values for final-selection of an adaptive code vector sacbr(j) from an equation 14 for final-selection on the Nacb (= 4)
adaptive code vectors after pre-selection Pacb(apsel(j),k), pre-selected by the comparator A 1322 itself.

$$sacbr(j) = \frac{prac^{2}(apsel(j))}{\sum_{k=0}^{Ns-1} SYNacb^{2}(j,k)} \tag{14}$$

where sacbr(j): reference value for final-selection of an adaptive code vector
prac(): reference values after pre-selection of adaptive code vectors
apsel(j): indices of adaptive code vectors after pre-selection
k: element number of a vector (0 ≤ k ≤ Ns-1)
j: number of the index of a pre-selected adaptive code vector (0 ≤ j ≤ Nacb-1)
Ns: subframe length (= 52)
Nacb: the number of pre-selected adaptive code vectors (= 4)
SYNacb(j,k): synthesized adaptive code vectors.

The index that gives the largest value of the equation 14 and the value of the equation 14 with that index used
as an argument are sent to the adaptive/fixed selector 1320 respectively as an index of adaptive code vector after
final-selection ASEL and a reference value after final-selection of an adaptive code vector sacbr(ASEL).
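
The two-stage search over adaptive code vectors (pre-selection by the equation 13, final selection by the equation 14) can be sketched as follows. This is only an illustration: the names `preselect` and `final_select` are hypothetical, vectors are Python lists, and the synthesized vectors are passed in rather than produced by a model of the filter A 1321.

```python
def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def preselect(pacb, rh, n_keep=4):
    """Equation 13: rank candidates by prac(i) and keep the n_keep largest.
    Returns (apsel, prac) where apsel lists the kept indices."""
    prac = [inner(v, rh) for v in pacb]
    apsel = sorted(range(len(pacb)), key=lambda i: prac[i], reverse=True)[:n_keep]
    return apsel, prac


def final_select(apsel, prac, syn):
    """Equation 14: syn[j] is the synthesized vector of candidate apsel[j].
    Returns (ASEL, sacbr(ASEL))."""
    sacbr = [prac[i] ** 2 / inner(s, s) for i, s in zip(apsel, syn)]
    best = max(range(len(apsel)), key=lambda j: sacbr[j])
    return apsel[best], sacbr[best]
```

Only the pre-selected candidates need to be synthesized, which is the saving this arrangement aims at.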

A fixed codebook 1323 holds Nfc (= 16) candidates of vectors to be read by a fixed code vector reading section
1324. To pre-select fixed code vectors Pfcb(i,k) (0 ≤ i ≤ Nfc-1, 0 ≤ k ≤ Ns-1) read by the fixed code vector reading
section 1324 to Nfcb (= 2) candidates from Nfc (= 16) candidates, the comparator A 1322 acquires the absolute values
|prfc(i)| of the inner products of the time reverse synthesized vector rh(k) (0 ≤ k ≤ Ns-1) of the target vector, received
from the perceptual weighted LPC reverse synthesis filter A 1317, and the fixed code vectors Pfcb(i,k) from an equation
15.

$$|prfc(i)| = \left| \sum_{k=0}^{Ns-1} Pfcb(i,k) \times rh(k) \right| \tag{15}$$

where |prfc(i)|: reference values for pre-selection of fixed code vectors
k: element number of a vector (0 ≤ k ≤ Ns-1)
i: number of a fixed code vector (0 ≤ i ≤ Nfc-1)
Nfc: the number of fixed code vectors (= 16)
Pfcb(i,k): fixed code vectors
rh(k): time reverse synthesized vector of the target vector r(k).



By comparing the values |prfc(i)| of the equation 15, the top Nfcb (= 2) indices in descending order of the values
and the absolute values of inner products with those indices used as arguments are selected and are respectively saved
as indices of fixed code vectors after pre-selection fpsel(j) (0 ≤ j ≤ Nfcb-1) and reference values for fixed code vectors
after pre-selection |prfc(fpsel(j))|, and the indices of fixed code vectors after pre-selection fpsel(j) (0 ≤ j ≤ Nfcb-1)
are output to the adaptive/fixed selector 1320.

The perceptual weighted LPC synthesis filter A 1321 performs perceptual weighted LPC synthesis on fixed code
vectors after pre-selection Pfcb(fpsel(j),k), which have been read from the fixed code vector reading section 1324 and
have passed the adaptive/fixed selector 1320, to generate synthesized fixed code vectors SYNfcb(fpsel(j),k) which are
in turn sent to the comparator A 1322.

The comparator A 1322 further acquires a reference value for final-selection of a fixed code vector sfcbr(j) from an
equation 16 to finally select an optimal fixed code vector from the Nfcb (= 2) fixed code vectors after pre-selection
Pfcb(fpsel(j),k), pre-selected by the comparator A 1322 itself.

$$sfcbr(j) = \frac{|prfc(fpsel(j))|^{2}}{\sum_{k=0}^{Ns-1} SYNfcb^{2}(j,k)} \tag{16}$$

where sfcbr(j): reference value for final-selection of a fixed code vector
|prfc()|: reference values after pre-selection of fixed code vectors
fpsel(j): indices of fixed code vectors after pre-selection (0 ≤ j ≤ Nfcb-1)
k: element number of a vector (0 ≤ k ≤ Ns-1)
j: number of a pre-selected fixed code vector (0 ≤ j ≤ Nfcb-1)
Ns: subframe length (= 52)
Nfcb: the number of pre-selected fixed code vectors (= 2)
SYNfcb(j,k): synthesized fixed code vectors.



The index that gives the largest value of the equation 16 and the value of the equation 16 with that index used
as an argument are sent to the adaptive/fixed selector 1320 respectively as an index of fixed code vector after
final-selection FSEL and a reference value after final-selection of a fixed code vector sfcbr(FSEL).

The adaptive/fixed selector 1320 selects either the adaptive code vector after final-selection or the fixed code
vector after final-selection as an adaptive/fixed code vector AF(k) (0 ≤ k ≤ Ns-1) in accordance with the size relation
and the polarity relation among prac(ASEL), sacbr(ASEL), |prfc(FSEL)| and sfcbr(FSEL) (described in an equation 17)
received from the comparator A 1322.

$$AF(k) = \begin{cases} Pacb(ASEL,k) & sacbr(ASEL) \ge sfcbr(FSEL),\ prac(ASEL) > 0 \\ 0 & sacbr(ASEL) \ge sfcbr(FSEL),\ prac(ASEL) \le 0 \\ Pfcb(FSEL,k) & sacbr(ASEL) < sfcbr(FSEL),\ prfc(FSEL) \ge 0 \\ -Pfcb(FSEL,k) & sacbr(ASEL) < sfcbr(FSEL),\ prfc(FSEL) < 0 \end{cases} \tag{17}$$

where AF(k): adaptive/fixed code vector
ASEL: index of adaptive code vector after final-selection
FSEL: index of fixed code vector after final-selection
k: element number of a vector
Pacb(ASEL,k): adaptive code vector after final-selection
Pfcb(FSEL,k): fixed code vector after final-selection
sacbr(ASEL): reference value after final-selection of an adaptive code vector
sfcbr(FSEL): reference value after final-selection of a fixed code vector
prac(ASEL): reference values after pre-selection of adaptive code vectors
prfc(FSEL): reference values after pre-selection of fixed code vectors.
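
The four cases of the equation 17 translate directly into a small decision function. The sketch below is illustrative only; the function name `select_adaptive_fixed` and the argument names are hypothetical, and the selected vectors are passed in as Python lists.

```python
def select_adaptive_fixed(pacb_sel, pfcb_sel, sacbr, sfcbr, prac, prfc):
    """Equation 17: choose the adaptive/fixed code vector AF(k).
    pacb_sel : Pacb(ASEL, k), pfcb_sel : Pfcb(FSEL, k)
    sacbr, sfcbr : final-selection reference values
    prac, prfc   : signed inner products of the selected vectors"""
    if sacbr >= sfcbr:
        # Adaptive branch: use the vector only when its inner product is positive.
        return list(pacb_sel) if prac > 0 else [0.0] * len(pacb_sel)
    # Fixed branch: flip the sign when the inner product is negative.
    if prfc >= 0:
        return list(pfcb_sel)
    return [-x for x in pfcb_sel]
```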

The selected adaptive/fixed code vector AF(k) is sent to the perceptual weighted LPC synthesis filter A 1321 and
an index representing the number that has generated the selected adaptive/fixed code vector AF(k) is sent as an
adaptive/fixed index AFSEL to the parameter coding section 1331. As the total number of adaptive code vectors and
fixed code vectors is designed to be 255 (see Table 6), the adaptive/fixed index AFSEL is a code of 8 bits.

The perceptual weighted LPC synthesis filter A 1321 performs perceptual weighted LPC synthesis on the
adaptive/fixed code vector AF(k), selected by the adaptive/fixed selector 1320, to generate a synthesized
adaptive/fixed code vector SYNaf(k) (0 ≤ k ≤ Ns-1) and sends it to the comparator A 1322.

The comparator A 1322 first obtains the power powp of the synthesized adaptive/fixed code vector SYNaf(k)
(0 ≤ k ≤ Ns-1) received from the perceptual weighted LPC synthesis filter A 1321 using an equation 18.

$$powp = \sum_{k=0}^{Ns-1} SYNaf^{2}(k) \tag{18}$$

where powp: power of adaptive/fixed code vector (SYNaf(k))
k: element number of a vector (0 ≤ k ≤ Ns-1)
Ns: subframe length (= 52)
SYNaf(k): adaptive/fixed code vector.

Then, the inner product pr of the target vector received from the target vector generator A 1316 and the synthesized
adaptive/fixed code vector SYNaf(k) is acquired from an equation 19.

$$pr = \sum_{k=0}^{Ns-1} SYNaf(k) \times r(k) \tag{19}$$

where pr: inner product of SYNaf(k) and r(k)
Ns: subframe length (= 52)
SYNaf(k): adaptive/fixed code vector
r(k): target vector
k: element number of a vector (0 ≤ k ≤ Ns-1).

45 

Further, the adaptive/fixed code vector AF(k) received from the adaptive/fixed selector 1320 is sent to an adaptive
codebook updating section 1333 to compute the power POWaf of AF(k), the synthesized adaptive/fixed code vector
SYNaf(k) and POWaf are sent to the parameter coding section 1331, and powp, pr, r(k) and rh(k) are sent to a
comparator B 1330.

The target vector generator B 1325 subtracts the synthesized adaptive/fixed code vector SYNaf(k), received from
the comparator A 1322, from the target vector r(i) (0 ≤ i ≤ Ns-1) received from the comparator A 1322, to generate a
new target vector, and sends the new target vector to the perceptual weighted LPC reverse synthesis filter B 1326.

The perceptual weighted LPC reverse synthesis filter B 1326 sorts the new target vector, generated by the target
vector generator B 1325, in a time reverse order, sends the sorted vector to the perceptual weighted LPC synthesis
filter in a zero state, and sorts the output vector again in a time reverse order to generate a time-reversed
synthesized vector ph(k) (0 ≤ k ≤ Ns-1) which is in turn sent to the comparator B 1330.

An excitation vector generator 1337 in use is the same as, for example, the excitation vector generator 70 which
has been described in the section of the third mode. The excitation vector generator 70 generates a random code vector
as the first seed is read from the seed storage section 71 and input to the non-linear digital filter 72. The random code
vector generated by the excitation vector generator 70 is sent to the perceptual weighted LPC synthesis filter B 1329
and the comparator B 1330. Then, as the second seed is read from the seed storage section 71 and input to the
non-linear digital filter 72, a random code vector is generated and output to the filter B 1329 and the comparator B 1330.

To pre-select random code vectors generated based on the first seed to Nstb (= 6) candidates from Nst (= 64)
candidates, the comparator B 1330 acquires reference values cr(i1) (0 ≤ i1 ≤ Nst-1) for pre-selection of first random
code vectors from an equation 20.

$$ cr(i1) = \sum_{j=0}^{Ns-1} Pstb1(i1,j) \times rh(j) - \frac{pr}{powp} \sum_{j=0}^{Ns-1} Pstb1(i1,j) \times ph(j) \tag{20} $$

where cr(i1): reference values for pre-selection of first random code vectors
Ns: subframe length (= 52)
rh(j): time reverse synthesized vector of a target vector (r(j))
powp: power of an adaptive/fixed vector (SYNaf(k))
pr: inner product of SYNaf(k) and r(k)
Pstb1(i1,j): first random code vector
ph(j): time reverse synthesized vector of SYNaf(k)
i1: number of the first random code vector (0 ≤ i1 ≤ Nst-1)
j: element number of a vector.

By comparing the obtained values cr(i1), the Nstb (= 6) indices giving the largest values are selected, and those
indices and the random code vectors they refer to are respectively saved as indices of first random code vectors
after pre-selection s1psel(j1) (0 ≤ j1 ≤ Nstb-1) and first random code vectors after pre-selection Pstb1(s1psel(j1),k)
(0 ≤ j1 ≤ Nstb-1, 0 ≤ k ≤ Ns-1). Then, the same process as done for the first random code vectors is performed for
second random code vectors, and the results are respectively saved as indices of second random code vectors after
pre-selection s2psel(j2) (0 ≤ j2 ≤ Nstb-1) and second random code vectors after pre-selection Pstb2(s2psel(j2),k)
(0 ≤ j2 ≤ Nstb-1, 0 ≤ k ≤ Ns-1).
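
The pre-selection step can be illustrated with a short sketch. It assumes that the equation 20 has the form cr = (P, rh) - (pr/powp) x (P, ph), where (a, b) denotes an inner product; the function name `preselect` and the list-based data layout are hypothetical and are not taken from this specification.

```python
def preselect(codebook, rh, ph, pr, powp, n_keep=6):
    """Keep the n_keep candidates with the largest reference value cr.

    codebook: list of candidate random code vectors (lists of floats)
    rh, ph:   time-reversed synthesized vectors of the target and of SYNaf
    pr, powp: inner product of SYNaf and r, and power of SYNaf
    Returns (indices, vectors) ordered from largest to smallest cr.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Assumed form of equation 20, evaluated for every candidate.
    cr = [dot(vec, rh) - (pr / powp) * dot(vec, ph) for vec in codebook]
    order = sorted(range(len(codebook)), key=lambda i: cr[i], reverse=True)
    kept = order[:n_keep]
    return kept, [codebook[i] for i in kept]
```

With Nst = 64 candidates and n_keep = 6, the same routine would be called once for the first random code vectors and once for the second random code vectors.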

The perceptual weighted LPC synthesis filter B 1329 performs perceptual weighted LPC synthesis on the first
random code vectors after pre-selection Pstb1(s1psel(j1),k) to generate synthesized first random code vectors
SYNstb1(s1psel(j1),k), which are in turn sent to the comparator B 1330. Then, perceptual weighted LPC synthesis is
performed on the second random code vectors after pre-selection Pstb2(s2psel(j2),k) to generate synthesized second
random code vectors SYNstb2(s2psel(j2),k), which are in turn sent to the comparator B 1330.

To implement final-selection on the first random code vectors after pre-selection Pstb1(s1psel(j1),k) and the
second random code vectors after pre-selection Pstb2(s2psel(j2),k), pre-selected by the comparator B 1330 itself, the
comparator B 1330 carries out the computation of an equation 21 on the synthesized first random code vectors
SYNstb1(s1psel(j1),k) computed in the perceptual weighted LPC synthesis filter B 1329.

$$ SYNOstb1(s1psel(j1),k) = SYNstb1(s1psel(j1),k) - \frac{SYNaf(k)}{powp} \sum_{m=0}^{Ns-1} Pstb1(s1psel(j1),m) \times ph(m) \tag{21} $$

where SYNOstb1(s1psel(j1),k): orthogonally synthesized first random code vector
SYNstb1(s1psel(j1),k): synthesized first random code vector
Pstb1(s1psel(j1),k): first random code vector after pre-selection
SYNaf(k): adaptive/fixed code vector
powp: power of adaptive/fixed code vector (SYNaf(k))
Ns: subframe length (= 52)
ph(k): time reverse synthesized vector of SYNaf(k)
j1: number of first random code vector after pre-selection
k, m: element numbers of a vector (0 ≤ k, m ≤ Ns-1).




Orthogonally synthesized first random code vectors SYNOstb1(s1psel(j1),k) are obtained, and a similar computation
is performed on the synthesized second random code vectors SYNstb2(s2psel(j2),k) to acquire orthogonally
synthesized second random code vectors SYNOstb2(s2psel(j2),k). Reference values after final-selection of a first
random code vector scr1 and reference values after final-selection of a second random code vector scr2 are then
computed in a closed loop, respectively using equations 22 and 23, for all the combinations (36 combinations) of
(s1psel(j1), s2psel(j2)).

$$ scr1 = \frac{cscr1^2}{\sum_{k=0}^{Ns-1} \left[ SYNOstb1(s1psel(j1),k) + SYNOstb2(s2psel(j2),k) \right]^2} \tag{22} $$

where scr1: reference value after final-selection of a first random code vector
cscr1: constant previously computed from an equation 24
SYNOstb1(s1psel(j1),k): orthogonally synthesized first random code vectors
SYNOstb2(s2psel(j2),k): orthogonally synthesized second random code vectors
r(k): target vector
s1psel(j1): index of first random code vector after pre-selection
s2psel(j2): index of second random code vector after pre-selection
Ns: subframe length (= 52)
k: element number of a vector



$$ scr2 = \frac{cscr2^2}{\sum_{k=0}^{Ns-1} \left[ SYNOstb1(s1psel(j1),k) - SYNOstb2(s2psel(j2),k) \right]^2} \tag{23} $$

where scr2: reference value after final-selection of a second random code vector
cscr2: constant previously computed from an equation 25
SYNOstb1(s1psel(j1),k): orthogonally synthesized first random code vectors
SYNOstb2(s2psel(j2),k): orthogonally synthesized second random code vectors
r(k): target vector
s1psel(j1): index of first random code vector after pre-selection
s2psel(j2): index of second random code vector after pre-selection
Ns: subframe length (= 52)
k: element number of a vector.

Note that cscr1 in the equation 22 and cscr2 in the equation 23 are constants which have been calculated
previously using the equations 24 and 25, respectively.

$$ cscr1 = \sum_{k=0}^{Ns-1} SYNOstb1(s1psel(j1),k) \times r(k) + \sum_{k=0}^{Ns-1} SYNOstb2(s2psel(j2),k) \times r(k) \tag{24} $$

where cscr1: constant for the equation 22
SYNOstb1(s1psel(j1),k): orthogonally synthesized first random code vectors
SYNOstb2(s2psel(j2),k): orthogonally synthesized second random code vectors
r(k): target vector
s1psel(j1): index of first random code vector after pre-selection
s2psel(j2): index of second random code vector after pre-selection
Ns: subframe length (= 52)
k: element number of a vector.

$$ cscr2 = \sum_{k=0}^{Ns-1} SYNOstb1(s1psel(j1),k) \times r(k) - \sum_{k=0}^{Ns-1} SYNOstb2(s2psel(j2),k) \times r(k) \tag{25} $$

where cscr2: constant for the equation 23
SYNOstb1(s1psel(j1),k): orthogonally synthesized first random code vectors
SYNOstb2(s2psel(j2),k): orthogonally synthesized second random code vectors
r(k): target vector
s1psel(j1): index of first random code vector after pre-selection
s2psel(j2): index of second random code vector after pre-selection
Ns: subframe length (= 52)
k: element number of a vector.
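
The closed-loop search over all pairs of pre-selected candidates can be sketched as follows. The sketch assumes that the equations 22 and 23 take the squared-correlation-over-energy form (constant squared, divided by the energy of the sum or difference of the two orthogonally synthesized vectors); the function name `final_select` and the returned tuple layout are hypothetical.

```python
def final_select(syno1, syno2, r):
    """Search all (j1, j2) pairs for the largest reference value.

    syno1, syno2: lists of orthogonally synthesized candidate vectors
    r:            target vector
    Returns (scr, j1, j2, scr1, scr2, cscr1, cscr2) for the best pair.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    best = None
    for j1, a in enumerate(syno1):
        for j2, b in enumerate(syno2):
            ca, cb = dot(a, r), dot(b, r)
            cscr1, cscr2 = ca + cb, ca - cb          # equations 24 and 25
            e_sum = sum((x + y) ** 2 for x, y in zip(a, b))
            e_dif = sum((x - y) ** 2 for x, y in zip(a, b))
            # Assumed form of equations 22 and 23; zero energy scores 0.
            scr1 = cscr1 ** 2 / e_sum if e_sum > 0 else 0.0
            scr2 = cscr2 ** 2 / e_dif if e_dif > 0 else 0.0
            scr = max(scr1, scr2)
            if best is None or scr > best[0]:
                best = (scr, j1, j2, scr1, scr2, cscr1, cscr2)
    return best
```

With Nstb = 6 candidates on each side, the double loop visits the 36 combinations mentioned above.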



The comparator B 1330 substitutes the maximum value of scr1 into MAXs1cr and the maximum value of scr2 into
MAXs2cr, sets MAXs1cr or MAXs2cr, whichever is larger, as scr, and sends the value of s1psel(j1), which had been
referred to when scr was obtained, to the parameter coding section 1331 as an index of a first random code vector after
final-selection SSEL1. The random code vector that corresponds to SSEL1 is saved as a first random code vector after
final-selection Pstb1(SSEL1,k), and is sent to the parameter coding section 1331 to acquire a synthesized first random
code vector after final-selection SYNstb1(SSEL1,k) (0 ≤ k ≤ Ns-1) corresponding to Pstb1(SSEL1,k).

Likewise, the value of s2psel(j2), which had been referred to when scr was obtained, is sent to the parameter coding
section 1331 as an index of a second random code vector after final-selection SSEL2. The random code vector that
corresponds to SSEL2 is saved as a second random code vector after final-selection Pstb2(SSEL2,k), and is sent to the
parameter coding section 1331 to acquire a synthesized second random code vector after final-selection
SYNstb2(SSEL2,k) (0 ≤ k ≤ Ns-1) corresponding to Pstb2(SSEL2,k).

The comparator B 1330 further acquires codes (signs) S1 and S2 by which Pstb1(SSEL1,k) and Pstb2(SSEL2,k) are
respectively multiplied, from an equation 26, and sends polarity information Is1s2 of the obtained S1 and S2 to the
parameter coding section 1331 as a gain polarity index Is1s2 (2-bit information).

$$ (S1, S2) = \begin{cases} (+1, +1) & scr1 \ge scr2,\ cscr1 \ge 0 \\ (-1, -1) & scr1 \ge scr2,\ cscr1 < 0 \\ (+1, -1) & scr1 < scr2,\ cscr2 \ge 0 \\ (-1, +1) & scr1 < scr2,\ cscr2 < 0 \end{cases} \tag{26} $$

where S1: code of the first random code vector after final-selection
S2: code of the second random code vector after final-selection
scr1: output of the equation 22
scr2: output of the equation 23
cscr1: output of the equation 24
cscr2: output of the equation 25.
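
The four-way case analysis of the equation 26 can be written as a small function. This is a minimal sketch; the function name `select_signs` is hypothetical, and the pairing of conditions with sign pairs follows the order in which they are listed in the equation 26.

```python
def select_signs(scr1, scr2, cscr1, cscr2):
    """Return (S1, S2) according to the case analysis of equation 26."""
    if scr1 >= scr2:
        # The sum of the two vectors won; both share the sign of cscr1.
        return (1, 1) if cscr1 >= 0 else (-1, -1)
    # The difference of the two vectors won; the signs are opposite.
    return (1, -1) if cscr2 >= 0 else (-1, 1)
```

Because only four outcomes exist, the result fits in the 2-bit gain polarity index described above.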

A random code vector ST(k) (0 ≤ k ≤ Ns-1) is generated by an equation 27 and output to the adaptive codebook
updating section 1333, and its power POWst is acquired and output to the parameter coding section 1331.

ST(k) = S1 × Pstb1(SSEL1,k) + S2 × Pstb2(SSEL2,k) (27)

where ST(k): random code vector
S1: code of the first random code vector after final-selection
S2: code of the second random code vector after final-selection
Pstb1(SSEL1,k): first random code vector after final-selection
Pstb2(SSEL2,k): second random code vector after final-selection
SSEL1: index of the first random code vector after final-selection
SSEL2: index of the second random code vector after final-selection
k: element number of a vector (0 ≤ k ≤ Ns-1).






A synthesized random code vector SYNst(k) (0 ≤ k ≤ Ns-1) is generated by an equation 28 and output to the
parameter coding section 1331.

SYNst(k) = S1 × SYNstb1(SSEL1,k) + S2 × SYNstb2(SSEL2,k) (28)

where SYNst(k): synthesized random code vector
S1: code of the first random code vector after final-selection
S2: code of the second random code vector after final-selection
SYNstb1(SSEL1,k): synthesized first random code vector after final-selection
SYNstb2(SSEL2,k): synthesized second random code vector after final-selection
k: element number of a vector (0 ≤ k ≤ Ns-1).

The parameter coding section 1331 first acquires a residual power estimation for each subframe rs from
an equation 29 using the decoded frame power spow, which has been obtained by the frame power quantizing/decoding
section 1302, and the normalized predictive residual power resid, which has been obtained by the pitch pre-selector
1308.

rs = Ns × spow × resid (29)

where rs: residual power estimation for each subframe 

Ns: subframe length (= 52) 

spow: decoded frame power 

resid: normalized predictive residual power. 

A reference value for quantization gain selection STDg is acquired from an equation 30 by using the acquired
residual power estimation for each subframe rs, the power of the adaptive/fixed code vector POWaf computed in the
comparator A 1322, the power of the random code vector POWst computed in the comparator B 1330, a gain quantization
table (CGaf(i), CGst(i)) (0 ≤ i ≤ 127) of 256 words stored in a gain quantization table storage section 1332 and the like.



Table 7: Gain quantization table

    i      CGaf(i)    CGst(i)
    1      0.38590    0.23477
    2      0.42380    0.50453
    3      0.23416    0.24761
    ...
    126    0.35382    1.68987
    127    0.10689    1.02035
    128    3.09711    1.75430



$$ STDg = \sum_{k=0}^{Ns-1} \left( \sqrt{\frac{rs}{POWaf}} \, CGaf(Ig) \times SYNaf(k) + \sqrt{\frac{rs}{POWst}} \, CGst(Ig) \times SYNst(k) - r(k) \right)^2 \tag{30} $$

where STDg: reference value for quantization gain selection
rs: residual power estimation for each subframe
POWaf: power of the adaptive/fixed code vector
POWst: power of the random code vector
i: index of the gain quantization table (0 ≤ i ≤ 127)
CGaf(i): component on the adaptive/fixed code vector side in the gain quantization table
CGst(i): component on the random code vector side in the gain quantization table
SYNaf(k): synthesized adaptive/fixed code vector
SYNst(k): synthesized random code vector
r(k): target vector
Ns: subframe length (= 52)
k: element number of a vector (0 ≤ k ≤ Ns-1).

The index at which the acquired reference value for quantization gain selection STDg becomes minimum is selected
as a gain quantization index Ig. A final gain on the adaptive/fixed code vector side Gaf to be actually applied to AF(k)
and a final gain on the random code vector side Gst to be actually applied to ST(k) are obtained from an equation 31
using a gain after selection of the adaptive/fixed code vector CGaf(Ig), which is read from the gain quantization table
based on the selected gain quantization index Ig, a gain after selection of the random code vector CGst(Ig), which is
read from the gain quantization table based on the selected gain quantization index Ig, and so forth, and are sent to the
adaptive codebook updating section 1333.

$$ (Gaf, Gst) = \left( \sqrt{\frac{rs}{POWaf}} \, CGaf(Ig), \ \sqrt{\frac{rs}{POWst}} \, CGst(Ig) \right) \tag{31} $$

where Gaf: final gain on the adaptive/fixed code vector side
Gst: final gain on the random code vector side
rs: residual power estimation for each subframe
POWaf: power of the adaptive/fixed code vector
POWst: power of the random code vector
CGaf(Ig): gain after selection on the adaptive/fixed code vector side
CGst(Ig): gain after selection on the random code vector side
Ig: gain quantization index.
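
The gain table search can be sketched as an exhaustive loop over the table entries. The sketch assumes that the equations 30 and 31 normalize each table entry by the square root of rs divided by the power of the corresponding code vector; that form is an assumption made for illustration, and the function name `search_gain` and the 0-based table indexing are hypothetical.

```python
import math

def search_gain(table, rs, pow_af, pow_st, syn_af, syn_st, r):
    """Return (Ig, Gaf, Gst) minimizing the squared error to the target.

    table: list of (CGaf, CGst) pairs, indexed from 0 in this sketch
    """
    norm_af = math.sqrt(rs / pow_af)   # assumed normalization factors
    norm_st = math.sqrt(rs / pow_st)
    best_i, best_err = None, None
    for i, (cg_af, cg_st) in enumerate(table):
        err = sum(
            (norm_af * cg_af * a + norm_st * cg_st * s - t) ** 2
            for a, s, t in zip(syn_af, syn_st, r)
        )
        if best_err is None or err < best_err:
            best_i, best_err = i, err
    cg_af, cg_st = table[best_i]
    return best_i, norm_af * cg_af, norm_st * cg_st
```

The returned pair corresponds to the final gains Gaf and Gst under the stated assumption.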

The parameter coding section 1331 converts the index of power Ipow, acquired by the frame power
quantizing/decoding section 1302, the LSP code Ilsp, acquired by the LSP quantizing/decoding section 1306, the
adaptive/fixed index AFSEL, acquired by the adaptive/fixed selector 1320, the index of the first random code vector
after final-selection SSEL1, the index of the second random code vector after final-selection SSEL2 and the polarity
information Is1s2, acquired by the comparator B 1330, and the gain quantization index Ig, acquired by the parameter
coding section 1331, into a speech code, which is in turn sent to a transmitter 1334.

The adaptive codebook updating section 1333 performs a process of an equation 32 for multiplying the
adaptive/fixed code vector AF(k), acquired by the comparator A 1322, and the random code vector ST(k), acquired by the
comparator B 1330, respectively by the final gain on the adaptive/fixed code vector side Gaf and the final gain on the
random code vector side Gst, acquired by the parameter coding section 1331, and then adding the results to thereby
generate an excitation vector ex(k) (0 ≤ k ≤ Ns-1), and sends the generated excitation vector ex(k) (0 ≤ k ≤ Ns-1) to the
adaptive codebook 1318.

ex(k) = Gaf × AF(k) + Gst × ST(k) (32)

where ex(k): excitation vector
AF(k): adaptive/fixed code vector
ST(k): random code vector
k: element number of a vector (0 ≤ k ≤ Ns-1).

At this time, an old excitation vector in the adaptive codebook 1318 is discarded and is updated with a new
excitation vector ex(k) received from the adaptive codebook updating section 1333.
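
The equation 32 and the subsequent codebook update can be sketched as follows. Modelling the adaptive codebook as a fixed-length buffer that drops its oldest samples is an assumption made for illustration; the function names are hypothetical.

```python
from collections import deque

def make_excitation(af, st, gaf, gst):
    """Equation 32: ex(k) = Gaf * AF(k) + Gst * ST(k)."""
    return [gaf * a + gst * s for a, s in zip(af, st)]

def update_adaptive_codebook(codebook, ex):
    """Append the new excitation; a deque with maxlen discards the oldest samples."""
    codebook.extend(ex)
    return codebook

# Example: a 4-sample history updated with a 2-sample excitation.
history = deque([0.0, 0.0, 0.0, 0.0], maxlen=4)
ex = make_excitation([1.0, 2.0], [0.5, -0.5], gaf=2.0, gst=4.0)
update_adaptive_codebook(history, ex)
```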

(Eighth Mode) 

A description will now be given of an eighth mode in which any excitation vector generator described in the first to
sixth modes is used in a speech decoder that is based on the PSI-CELP, the standard speech coding/decoding system for
PDC digital portable telephones. This decoder makes a pair with the above-described seventh mode.

FIG. 14 presents a functional block diagram of a speech decoder according to the eighth mode. A parameter
decoding section 1402 obtains the speech code (the index of power Ipow, LSP code Ilsp, adaptive/fixed index AFSEL,
index of the first random code vector after final-selection SSEL1, index of the second random code vector after
final-selection SSEL2, gain quantization index Ig and gain polarity index Is1s2), sent from the CELP type speech coder
illustrated in FIG. 13, via a transmitter 1401.

Next, a scalar value indicated by the index of power Ipow is read from the power quantization table (see Table 3)
stored in a power quantization table storage section 1405 and is sent as decoded frame power spow to a power restoring
section 1417, and a vector indicated by the LSP code Ilsp is read from the LSP quantization table stored in an LSP
quantization table storage section 1404 and is sent as a decoded LSP to an LSP interpolation section 1406. The
adaptive/fixed index AFSEL is sent to an adaptive code vector generator 1408, a fixed code vector reading section 1411
and an adaptive/fixed selector 1412, and the index of the first random code vector after final-selection SSEL1 and the
index of the second random code vector after final-selection SSEL2 are output to an excitation vector generator 1414.
The vector (CGaf(Ig), CGst(Ig)) indicated by the gain quantization index Ig is read from the gain quantization table
(see Table 7) stored in a gain quantization table storage section 1403, the final gain on the adaptive/fixed code vector
side Gaf to be actually applied to AF(k) and the final gain on the random code vector side Gst to be actually applied to
ST(k) are acquired from the equation 31 as done on the coder side, and the acquired final gain on the adaptive/fixed
code vector side Gaf and final gain on the random code vector side Gst are output together with the gain polarity index
Is1s2 to an excitation vector generator 1413.

The LSP interpolation section 1406 obtains a decoded interpolated LSP ωintp(n,i) (1 ≤ i ≤ Np) subframe by
subframe from the decoded LSP received from the parameter decoding section 1402, converts the obtained ωintp(n,i) to
an LPC to acquire a decoded interpolated LPC, and sends the decoded interpolated LPC to an LPC synthesis filter
1416.

The adaptive code vector generator 1408 convolves some of the polyphase coefficients stored in a polyphase
coefficients storage section 1409 (see Table 5) with vectors read from an adaptive codebook 1407, based on the
adaptive/fixed index AFSEL received from the parameter decoding section 1402, thereby generating adaptive code
vectors to a fractional precision, and sends the adaptive code vectors to the adaptive/fixed selector 1412. The fixed
code vector reading section 1411 reads fixed code vectors from a fixed codebook 1410 based on the adaptive/fixed index
AFSEL received from the parameter decoding section 1402, and sends them to the adaptive/fixed selector 1412.

The adaptive/fixed selector 1412 selects either the adaptive code vector input from the adaptive code vector
generator 1408 or the fixed code vector input from the fixed code vector reading section 1411, as the adaptive/fixed code
vector AF(k), based on the adaptive/fixed index AFSEL received from the parameter decoding section 1402, and sends
the selected adaptive/fixed code vector AF(k) to the excitation vector generator 1413. The excitation vector generator
1414 acquires the first seed and second seed from the seed storage section 71 based on the index of the first random
code vector after final-selection SSEL1 and the index of the second random code vector after final-selection SSEL2
received from the parameter decoding section 1402, and sends the seeds to the non-linear digital filter 72 to generate
the first random code vector and the second random code vector, respectively. Those reproduced first and second
random code vectors are respectively multiplied by the first-stage information S1 and second-stage information S2 of the
gain polarity index to generate an excitation vector ST(k), which is sent to the excitation vector generator 1413.

The excitation vector generator 1413 multiplies the adaptive/fixed code vector AF(k), received from the
adaptive/fixed selector 1412, and the excitation vector ST(k), received from the excitation vector generator 1414,
respectively by the final gain on the adaptive/fixed code vector side Gaf and the final gain on the random code vector
side Gst, obtained by the parameter decoding section 1402, performs addition or subtraction based on the gain polarity
index Is1s2, yielding the excitation vector ex(k), and sends the obtained excitation vector to the LPC synthesis filter
1416 and the adaptive codebook 1407. Here, an old excitation vector in the adaptive codebook 1407 is updated with a
new excitation vector input from the excitation vector generator 1413.

The LPC synthesis filter 1416 performs LPC synthesis on the excitation vector, generated by the excitation vector
generator 1413, using the synthesis filter which is constituted by the decoded interpolated LPC received from the LSP
interpolation section 1406, and sends the filter output to the power restoring section 1417. The power restoring section
1417 first obtains the mean power of the synthesized vector of the excitation vector obtained by the LPC synthesis filter
1416, then divides the decoded frame power spow, received from the parameter decoding section 1402, by the
acquired mean power, and multiplies the synthesized vector of the excitation vector by the division result to generate a
synthesized speech 518.

(Ninth Mode) 

FIG. 15 is a block diagram of the essential portions of a speech coder according to a ninth mode. This speech coder
has a quantization target LSP adding section 151, an LSP quantizing/decoding section 152 and an LSP quantization error
comparator 153 added to the speech coder shown in FIG. 13, or has parts of its functions modified.

The LPC analyzing section 1304 acquires an LPC by performing linear predictive analysis on a processing frame
in the buffer 1301, converts the acquired LPC to produce a quantization target LSP, and sends the produced
quantization target LSP to the quantization target LSP adding section 151. The LPC analyzing section 1304 also has a
particular function of performing linear predictive analysis on a pre-read area to acquire an LPC for the pre-read area,
converting the obtained LPC to an LSP for the pre-read area, and sending the LSP to the quantization target LSP adding
section 151.

The quantization target LSP adding section 151 produces a plurality of quantization target LSPs in addition to the
quantization target LSPs directly obtained by converting LPCs in a processing frame in the LPC analyzing section 1304.

The LSP quantization table storage section 1307 stores the quantization table which is referred to by the LSP
quantizing/decoding section 152, and the LSP quantizing/decoding section 152 quantizes/decodes the produced plurality
of quantization target LSPs to generate decoded LSPs.

The LSP quantization error comparator 153 compares the produced decoded LSPs with one another to select, in
a closed loop, one decoded LSP which minimizes an allophone, and newly uses the selected decoded LSP as a
decoded LSP for the processing frame.

FIG. 16 presents a block diagram of the quantization target LSP adding section 151.

The quantization target LSP adding section 151 comprises a current frame LSP memory 161 for storing the
quantization target LSP of the processing frame obtained by the LPC analyzing section 1304, a pre-read area LSP memory
162 for storing the LSP of the pre-read area obtained by the LPC analyzing section 1304, a previous frame LSP memory
163 for storing the decoded LSP of the previous processing frame, and a linear interpolation section 164 which
performs linear interpolation on the LSPs read from those three memories to add a plurality of quantization target LSPs.

A plurality of quantization target LSPs are additionally produced by performing linear interpolation on the
quantization target LSP of the processing frame and the LSP of the pre-read area, and the produced quantization target
LSPs are all sent to the LSP quantizing/decoding section 152.

The quantization target LSP adding section 151 will now be explained more specifically. The LPC analyzing section
1304 performs linear predictive analysis on the processing frame in the buffer to acquire an LPC α(i) (1 ≤ i ≤ Np) of a
prediction order Np (= 10), converts the obtained LPC to generate a quantization target LSP ω(i) (1 ≤ i ≤ Np), and stores
the generated quantization target LSP ω(i) (1 ≤ i ≤ Np) in the current frame LSP memory 161 in the quantization target
LSP adding section 151. Further, the LPC analyzing section 1304 performs linear predictive analysis on the pre-read
area in the buffer to acquire an LPC for the pre-read area, converts the obtained LPC to generate a quantization target
LSP ωf(i) (1 ≤ i ≤ Np), and stores the generated quantization target LSP ωf(i) (1 ≤ i ≤ Np) for the pre-read area in the
pre-read area LSP memory 162 in the quantization target LSP adding section 151.

Next, the linear interpolation section 164 reads the quantization target LSP ω(i) (1 ≤ i ≤ Np) for the processing
frame from the current frame LSP memory 161, the LSP ωf(i) (1 ≤ i ≤ Np) for the pre-read area from the pre-read area
LSP memory 162, and the decoded LSP ωqp(i) (1 ≤ i ≤ Np) for the previous processing frame from the previous frame
LSP memory 163, and executes the conversion shown by an equation 33 to respectively generate a first additional
quantization target LSP ω1(i) (1 ≤ i ≤ Np), a second additional quantization target LSP ω2(i) (1 ≤ i ≤ Np), and a third
additional quantization target LSP ω3(i) (1 ≤ i ≤ Np).

where ω1(i): first additional quantization target LSP
ω2(i): second additional quantization target LSP
ω3(i): third additional quantization target LSP
i: LPC order (1 ≤ i ≤ Np)
Np: LPC analysis order (= 10)
ωq(i): decoded LSP for the processing frame
ωqp(i): decoded LSP for the previous processing frame
ωf(i): LSP for the pre-read area.

The generated ω1(i), ω2(i) and ω3(i) are sent to the LSP quantizing/decoding section 152. After performing vector
quantization/decoding of all four quantization target LSPs ω(i), ω1(i), ω2(i) and ω3(i), the LSP quantizing/decoding
section 152 acquires power Epow(ω) of a quantization error for ω(i), power Epow(ω1) of a quantization error for ω1(i),
power Epow(ω2) of a quantization error for ω2(i), and power Epow(ω3) of a quantization error for ω3(i), and carries out
the conversion of an equation 34 on the obtained quantization error powers to acquire reference values STDlsp(ω),
STDlsp(ω1), STDlsp(ω2) and STDlsp(ω3) for selection of a decoded LSP.

(34)

where STDlsp(ω): reference value for selection of a decoded LSP for ω(i)
STDlsp(ω1): reference value for selection of a decoded LSP for ω1(i)
STDlsp(ω2): reference value for selection of a decoded LSP for ω2(i)
STDlsp(ω3): reference value for selection of a decoded LSP for ω3(i)
Epow(ω): quantization error power for ω(i)
Epow(ω1): quantization error power for ω1(i)
Epow(ω2): quantization error power for ω2(i)
Epow(ω3): quantization error power for ω3(i).

The acquired reference values for selection of a decoded LSP are compared with one another to select and output
the decoded LSP for the quantization target LSP whose reference value is minimum as a decoded LSP ωq(i)
(1 ≤ i ≤ Np) for the processing frame, and the decoded LSP is stored in the previous frame LSP memory 163 so that it
can be referred to at the time of performing vector quantization of the LSP of the next frame.

According to this mode, by effectively using the high interpolation characteristic of an LSP (which does not cause
an allophone even if synthesis is implemented by using interpolated LSPs), vector quantization of LSPs can be
conducted so as not to produce an allophone even for an area like the top of a word where the spectrum varies
significantly. It is possible to reduce an allophone in a synthesized speech which may occur when the quantization
characteristic of an LSP becomes insufficient.

FIG. 17 presents a block diagram of the LSP quantizing/decoding section 152 according to this mode. The LSP
quantizing/decoding section 152 has a gain information storage section 171, an adaptive gain selector 172, a gain
multiplier 173, an LSP quantizing section 174 and an LSP decoding section 175.

The gain information storage section 171 stores a plurality of gain candidates to be referred to at the time the
adaptive gain selector 172 selects the adaptive gain. The gain multiplier 173 multiplies a code vector, read from the LSP
quantization table storage section 1307, by the adaptive gain selected by the adaptive gain selector 172. The LSP
quantizing section 174 performs vector quantization of a quantization target LSP using the code vector multiplied by the
adaptive gain. The LSP decoding section 175 has a function of decoding a vector-quantized LSP to generate a decoded
LSP and outputting it, and a function of acquiring an LSP quantization error, which is a difference between the
quantization target LSP and the decoded LSP, and sending it to the adaptive gain selector 172. The adaptive gain
selector 172 acquires the adaptive gain by which a code vector is multiplied at the time of vector-quantizing the
quantization target LSP of the processing frame. It does so by adaptively adjusting the adaptive gain based on gain
generation information stored in the gain information storage section 171, using as references the level of the adaptive
gain by which a code vector was multiplied when the quantization target LSP of the previous processing frame was
vector-quantized and the LSP quantization error for the previous frame, and sends the obtained adaptive gain to the
gain multiplier 173.

The LSP quantizing/decoding section 152 vector-quantizes and decodes a quantization target LSP while
adaptively adjusting the adaptive gain by which a code vector is multiplied in the above manner.

The LSP quantizing/decoding section 152 will now be discussed more specifically. The gain information storage
section 171 stores four gain candidates (0.9, 1.0, 1.1 and 1.2) to which the adaptive gain selector 172 refers. The
adaptive gain selector 172 acquires a reference value for selecting an adaptive gain, Slsp, from an equation 35 for
dividing power ERpow, generated at the time of quantizing the quantization target LSP of the previous frame, by the
square of an adaptive gain Gqlsp selected at the time of vector-quantizing the quantization target LSP of the previous
processing frame.

$$ Slsp = \frac{ERpow}{Gqlsp^2} \tag{35} $$

where Slsp: reference value for selecting an adaptive gain
ERpow: quantization error power generated when quantizing the LSP of the previous frame
Gqlsp: adaptive gain selected when vector-quantizing the LSP of the previous frame.

One gain is selected from the four gain candidates (0.9, 1.0, 1.1 and 1.2), read from the gain information storage
section 171, from an equation 36 using the acquired reference value Slsp for selecting the adaptive gain. Then, the
value of the selected adaptive gain Glsp is sent to the gain multiplier 173, and information (2-bit information) for
specifying the type of the selected adaptive gain from the four types is sent to the parameter coding section.

$$ Glsp = \begin{cases} 1.2 & Slsp > 0.0025 \\ 1.1 & Slsp > 0.0015 \\ 1.0 & Slsp > 0.0008 \\ 0.9 & Slsp \le 0.0008 \end{cases} \tag{36} $$

where Glsp: adaptive gain by which a code vector for LSP quantization is multiplied
Slsp: reference value for selecting an adaptive gain.

The selected adaptive gain Glsp and the power of the error which has been produced in quantization are saved in
the variables Gqlsp and ERpow until the quantization target LSP of the next frame is subjected to vector quantization.
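
The gain adaptation described by the equations 35 and 36 can be sketched as a threshold lookup. The thresholds and gain candidates are those listed above, and the cases are evaluated from the top so that the first satisfied condition wins; the function name `select_adaptive_gain` is hypothetical.

```python
def select_adaptive_gain(er_pow_prev, gq_lsp_prev):
    """Pick the adaptive gain Glsp from the previous frame's error and gain.

    er_pow_prev: quantization error power of the previous frame (ERpow)
    gq_lsp_prev: adaptive gain used for the previous frame (Gqlsp)
    """
    s_lsp = er_pow_prev / (gq_lsp_prev ** 2)   # equation 35
    # Equation 36, evaluated top to bottom.
    if s_lsp > 0.0025:
        return 1.2
    if s_lsp > 0.0015:
        return 1.1
    if s_lsp > 0.0008:
        return 1.0
    return 0.9
```

After each frame, the caller would store the returned gain and the new error power so that they serve as Gqlsp and ERpow for the next frame.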

The gain multiplier 173 multiplies a code vector, read from the LSP quantization table storage section 1307, by the
adaptive gain selected by the adaptive gain selector 172, and sends the result to the LSP quantizing section 174. The
LSP quantizing section 174 performs vector quantization on the quantization target LSP by using the code vector
multiplied by the adaptive gain, and sends its index to the parameter coding section. The LSP decoding section 175
decodes the LSP quantized by the LSP quantizing section 174 to acquire a decoded LSP, outputs this decoded LSP,
subtracts the obtained decoded LSP from the quantization target LSP to obtain an LSP quantization error, computes
the power ERpow of the obtained LSP quantization error, and sends the power to the adaptive gain selector 172.

This mode can suppress an allophone in a synthesized speech which may be produced when the quantization
characteristic of an LSP becomes insufficient.
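
The interaction of the gain multiplier, the LSP quantizing section and the LSP decoding section can be sketched as below. The sketch assumes a plain squared-error distance and an unnormalized sum of squares for the error power; the text does not state either, and the function name `quantize_lsp` is hypothetical.

```python
def quantize_lsp(target, codebook, gain):
    """Pick the code vector that, scaled by gain, is closest to target.

    Returns (index, decoded_lsp, error_power). Squared Euclidean distance
    is an assumption; the actual coder may use a weighted measure.
    """
    best = None
    for index, code_vector in enumerate(codebook):
        decoded = [gain * c for c in code_vector]            # gain multiplier
        power = sum((t - d) ** 2 for t, d in zip(target, decoded))
        if best is None or power < best[2]:
            best = (index, decoded, power)
    return best
```

The returned error power plays the role of ERpow, which is fed back to the adaptive gain selector for the next frame.
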

(Tenth Mode)

FIG. 18 presents the structural blocks of an excitation vector generator according to this mode. This excitation
vector generator has a fixed waveform storage section 181 for storing three fixed waveforms (v1 (length: L1), v2 (length:
L2) and v3 (length: L3)) of channels CH1, CH2 and CH3, a fixed waveform arranging section 182 for arranging the fixed
waveforms (v1, v2, v3), read from the fixed waveform storage section 181, respectively at positions P1, P2 and P3, and
an adding section 183 for adding the fixed waveforms arranged by the fixed waveform arranging section 182, generating
an excitation vector.

The operation of the thus constituted excitation vector generator will be discussed. 

Three fixed waveforms v1, v2 and v3 are stored in advance in the fixed waveform storage section 181. The fixed
waveform arranging section 182 arranges (shifts) the fixed waveform v1, read from the fixed waveform storage section
181, at the position P1 selected from start position candidates for CH1, based on the start position candidate
information for fixed waveforms it has as shown in Table 8, and likewise arranges the fixed waveforms v2 and v3 at the
respective positions P2 and P3 selected from start position candidates for CH2 and CH3.

Table 8

Channel   Sign   Start position candidate information
number           for fixed waveform
-------   ----   ------------------------------------
CH1       ±1     P1 (0, 10, 20, 30, ..., 60, 70)
CH2       ±1     P2 (2, 12, 22, 32, ..., 62, 72,
                     6, 16, 26, 36, ..., 66, 76)
CH3       ±1     P3 (4, 14, 24, 34, ..., 64, 74,
                     8, 18, 28, 38, ..., 68, 78)


The adding section 183 adds the fixed waveforms, arranged by the fixed waveform arranging section 182, to gen- 
erate an excitation vector. 
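
The arrange-and-add operation of sections 182 and 183 can be sketched as follows. This is an illustration under stated assumptions: the function name `generate_excitation` is hypothetical, the sign argument reflects the Sign column of Table 8, and dropping samples that would fall past the end of the frame is an assumption, since the text does not say how such samples are handled.

```python
def generate_excitation(waveforms, positions, signs, frame_length):
    """Place each fixed waveform at its start position and sum the channels."""
    excitation = [0.0] * frame_length
    for waveform, start, sign in zip(waveforms, positions, signs):
        for offset, sample in enumerate(waveform):
            n = start + offset
            if n < frame_length:  # samples past the frame end are dropped (assumption)
                excitation[n] += sign * sample
    return excitation
```

For example, with toy waveforms `[1.0, 0.5]`, `[2.0]` and `[1.0, 1.0]` placed at positions 0, 2 and 1 with signs +1, -1 and +1 in a frame of length 4, the overlapping samples simply add.
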

It is to be noted that code numbers corresponding, one to one, to combination information of selectable start
position candidates of the individual fixed waveforms (information representing which positions were selected as P1, P2
and P3, respectively) should be assigned to the start position candidate information of the fixed waveforms that the
fixed waveform arranging section 182 has.

According to the excitation vector generator with the above structure, excitation information can be transmitted by
transmitting code numbers correlating to the start position candidate information of fixed waveforms that the fixed
waveform arranging section 182 has, and as many code numbers exist as the product of the numbers of start position
candidates of the individual channels, so that an excitation vector close to an actual speech can be generated.
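
One way to assign code numbers one to one to position combinations is a mixed-radix index, sketched below. The text does not prescribe this particular numbering, so it is only one possible assignment. The candidate lists assume that the entries of Table 8 continue in steps of 10 up to the last listed value, which is an assumption; the names `CANDIDATES`, `encode_positions` and `decode_positions` are hypothetical, and signs are left out.

```python
# Start position candidates per channel, assuming a step-of-10 pattern
# between the listed first and last entries of Table 8 (an assumption).
CANDIDATES = [
    list(range(0, 80, 10)),                           # CH1
    list(range(2, 80, 10)) + list(range(6, 80, 10)),  # CH2
    list(range(4, 80, 10)) + list(range(8, 80, 10)),  # CH3
]


def encode_positions(indices):
    """Map per-channel candidate indices to a single code number (mixed radix)."""
    code = 0
    for index, candidates in zip(indices, CANDIDATES):
        code = code * len(candidates) + index
    return code


def decode_positions(code):
    """Invert encode_positions and return the selected start positions."""
    indices = []
    for candidates in reversed(CANDIDATES):
        code, index = divmod(code, len(candidates))
        indices.append(index)
    indices.reverse()
    return [c[i] for c, i in zip(CANDIDATES, indices)]
```

Under these assumptions there are 8 x 16 x 16 = 2048 combinations, so the positions alone would fit in 11 bits.
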

Since excitation information can be transmitted by transmitting code numbers, this excitation vector generator can
be used as a random codebook in a speech coder/decoder.

While the description of this mode has been given with reference to a case of using three fixed waveforms as shown
in FIG. 18, similar functions and advantages can be provided if the number of fixed waveforms (which coincides with
the number of channels in FIG. 18 and Table 8) is changed to other values.

Although the fixed waveform arranging section 182 in this mode has been described as having the start position 
candidate information of fixed waveforms given in Table 8, similar functions and advantages can be provided for other 
start position candidate information of fixed waveforms than those in Table 8. 


(Eleventh Mode) 

FIG. 19A is a structural block diagram of a CELP type speech coder according to this mode, and FIG. 19B is a
structural block diagram of a CELP type speech decoder which is paired with the CELP type speech coder.

The CELP type speech coder according to this mode has an excitation vector generator which comprises a fixed
waveform storage section 181A, a fixed waveform arranging section 182A and an adding section 183A. The fixed
waveform storage section 181A stores a plurality of fixed waveforms. The fixed waveform arranging section 182A arranges
(shifts) fixed waveforms, read from the fixed waveform storage section 181A, respectively at the selected positions,
based on start position candidate information for fixed waveforms it has. The adding section 183A adds the fixed
waveforms, arranged by the fixed waveform arranging section 182A, to generate an excitation vector c.

This CELP type speech coder has a time reversing section 191 for time-reversing a random codebook searching
target x to be input, a synthesis filter 192 for synthesizing the output of the time reversing section 191, a time reversing
section 193 for time-reversing the output of the synthesis filter 192 again to yield a time-reversed synthesized target x',
a synthesis filter 194 for synthesizing the excitation vector c multiplied by a random code vector gain gc, yielding a
synthesized excitation vector s, a distortion calculator 205 for receiving x', c and s and computing distortion, and a
transmitter 196.

According to this mode, the fixed waveform storage section 181A, the fixed waveform arranging section 182A and
the adding section 183A correspond to the fixed waveform storage section 181, the fixed waveform arranging section
182 and the adding section 183 shown in FIG. 18, the start position candidates of fixed waveforms in the individual
channels correspond to those in Table 8, and channel numbers, fixed waveform numbers and symbols indicating the
lengths and positions in use are those shown in FIG. 18 and Table 8.

The CELP type speech decoder in FIG. 19B comprises a fixed waveform storage section 181B for storing a plurality
of fixed waveforms, a fixed waveform arranging section 182B for arranging (shifting) fixed waveforms, read from the
fixed waveform storage section 181B, respectively at the selected positions, based on start position candidate
information for fixed waveforms it has, an adding section 183B for adding the fixed waveforms, arranged by the fixed
waveform arranging section 182B, to yield an excitation vector c, a gain multiplier 197 for multiplying the excitation
vector c by a random code vector gain gc, and a synthesis filter 198 for synthesizing the excitation vector c to yield a
synthesized excitation vector s.

The fixed waveform storage section 181B and the fixed waveform arranging section 182B in the speech decoder
have the same structures as the fixed waveform storage section 181A and the fixed waveform arranging section 182A
in the speech coder. The fixed waveforms stored in the fixed waveform storage sections 181A and 181B are obtained by
cost-function based learning so as to statistically minimize a cost function, the cost function being the coding
distortion of equation 3 computed with a random codebook searching target.

The operation of the thus constituted speech coder will be discussed.

The random codebook searching target x is time-reversed by the time reversing section 191, then synthesized by
the synthesis filter 192 and then time-reversed again by the time reversing section 193, and the result is sent as a
time-reversed synthesized target x' to the distortion calculator 205.
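
The reverse, filter, reverse sequence can be checked with a small sketch. It assumes the synthesis filter acts as a zero-state convolution with an impulse response h truncated to the frame, so that filtering corresponds to multiplying by a lower-triangular convolution matrix H; under that assumption the three steps produce the transpose product of H with x. All function names are hypothetical.

```python
def synthesize(signal, h):
    """Zero-state filtering: truncated convolution of signal with impulse response h."""
    length = len(signal)
    return [
        sum(h[n - k] * signal[k] for k in range(n + 1) if n - k < len(h))
        for n in range(length)
    ]


def time_reversed_target(x, h):
    """Reverse, filter, reverse again (sections 191, 192 and 193)."""
    return synthesize(x[::-1], h)[::-1]


def correlate_directly(x, h):
    """Reference: element k is the sum over n >= k of h[n - k] * x[n]."""
    length = len(x)
    return [
        sum(h[n - k] * x[n] for n in range(k, length) if n - k < len(h))
        for k in range(length)
    ]
```

Both routes give the same vector, which is why the inner product of x' with a candidate excitation vector c can stand in for the inner product of x with the synthesized vector, without filtering every candidate.
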

The fixed waveform arranging section 182A arranges (shifts) the fixed waveform v1, read from the fixed waveform
storage section 181A, at the position P1 selected from start position candidates for CH1, based on the start position
candidate information for fixed waveforms it has as shown in Table 8, and likewise arranges the fixed waveforms v2 and v3
at the respective positions P2 and P3 selected from start position candidates for CH2 and CH3. The arranged fixed
waveforms are sent to the adding section 183A and added to become an excitation vector c, which is input to the
synthesis filter 194. The synthesis filter 194 synthesizes the excitation vector c to produce a synthesized excitation
vector s and sends it to the distortion calculator 205.

The distortion calculator 205 receives the time-reversed synthesized target x', the excitation vector c and the
synthesized excitation vector s and computes the coding distortion in equation 4.

The distortion calculator 205 sends a signal to the fixed waveform arranging section 182A after computing the
distortion. The process from the selection of start position candidates corresponding to the three channels by the fixed
waveform arranging section 182A to the distortion computation by the distortion calculator 205 is repeated for every
combination of the start position candidates selectable by the fixed waveform arranging section 182A.

Thereafter, the combination of the start position candidates that minimizes the coding distortion is selected, and the
code number which corresponds, one to one, to that combination of the start position candidates and the then optimal
random code vector gain gc are transmitted as codes of the random codebook to the transmitter 196.

The fixed waveform arranging section 182B selects the positions of the fixed waveforms in the individual channels
from start position candidate information for fixed waveforms it has, based on information sent from the transmitter 196,
arranges (shifts) the fixed waveform v1, read from the fixed waveform storage section 181B, at the position P1 selected
from start position candidates for CH1, and likewise arranges the fixed waveforms v2 and v3 at the respective positions
P2 and P3 selected from start position candidates for CH2 and CH3. The arranged fixed waveforms are sent to the
adding section 183B and added to become an excitation vector c. This excitation vector c is multiplied by the random code
vector gain gc selected based on the information from the transmitter 196, and the result is sent to the synthesis filter
198. The synthesis filter 198 synthesizes the gc-multiplied excitation vector c to yield a synthesized excitation vector s
and sends it out.

According to the speech coder/decoder with the above structures, as an excitation vector is generated by the
excitation vector generator which comprises the fixed waveform storage section, the fixed waveform arranging section and
the adding section, a synthesized excitation vector obtained by synthesizing this excitation vector in the synthesis filter
has a characteristic statistically close to that of an actual target, so that a high-quality synthesized speech can be
obtained in addition to the advantages of the tenth mode.

Although the foregoing description of this mode has been given with reference to a case where fixed waveforms
obtained by learning are stored in the fixed waveform storage sections 181A and 181B, high-quality synthesized
speeches can also be obtained even when fixed waveforms prepared based on the result of statistical analysis of the
random codebook searching target x are used or when knowledge-based fixed waveforms are used.

While the description of this mode has been given with reference to a case of using three fixed waveforms, similar
functions and advantages can be provided if the number of fixed waveforms is changed to other values.

Although the fixed waveform arranging section in this mode has been described as having the start position
candidate information of fixed waveforms given in Table 8, similar functions and advantages can be provided for other
start position candidate information of fixed waveforms than those in Table 8.

(Twelfth Mode)



FIG. 20 presents a structural block diagram of a CELP type speech coder according to this mode.

This CELP type speech coder includes a fixed waveform storage section 200 for storing a plurality of fixed
waveforms (three in this mode: CH1:w1, CH2:w2 and CH3:w3), and a fixed waveform arranging section 201 which has start
position candidate information of fixed waveforms for generating start positions of the fixed waveforms, stored in the
fixed waveform storage section 200, according to algebraic rules. This CELP type speech coder further has an
impulse response calculator 202 for each fixed waveform, an impulse generator 203, a correlation matrix
calculator 204, a time reversing section 191, a synthesis filter 192' for each waveform, a time reversing section 193 and a
distortion calculator 205.

The impulse response calculator 202 has a function of convoluting three fixed waveforms from the fixed waveform
storage section 200 and the impulse response h (length L = subframe length) of the synthesis filter to compute three
kinds of impulse responses for the individual fixed waveforms (CH1:h1, CH2:h2 and CH3:h3, length L = subframe
length).

The synthesis filter 192' has a function of convoluting the output of the time reversing section 191, which is the
result of time-reversing the random codebook searching target x to be input, and the impulse responses for the
individual waveforms, h1, h2 and h3, from the impulse response calculator 202.

The impulse generator 203 sets a pulse of an amplitude 1 (a polarity present) only at the start position candidates
P1, P2 and P3, selected by the fixed waveform arranging section 201, generating impulses for the individual channels
(CH1:d1, CH2:d2 and CH3:d3).

The correlation matrix calculator 204 computes the autocorrelation of each of the impulse responses h1, h2 and h3 for
the individual waveforms from the impulse response calculator 202, and the correlations between h1 and h2, h1 and h3,
and h2 and h3, and develops the obtained correlation values in a correlation matrix RR.

The distortion calculator 205 specifies the random code vector that minimizes the coding distortion, from equation
37, a modification of equation 4, by using three time-reversed synthesis targets (x'1, x'2 and x'3), the correlation
matrix RR and the three impulses (d1, d2 and d3) for the individual channels.

$$
\frac{\left( \sum_{i=1}^{3} {x'_i}^{t} d_i \right)^{2}}{\sum_{i=1}^{3} \sum_{j=1}^{3} d_i^{t} H_i^{t} H_j d_j} \qquad (37)
$$

where $d_i$: impulse (vector) for each channel, $d_i = \pm 1 \times \delta(k - p_i)$, $k = 0$ to $L-1$,
$p_i$: start position candidate of the i-th channel
$H_i$: impulse response convolution matrix for each waveform ($H_i = H W_i$)
$W_i$: fixed waveform convolution matrix, where $w_i$ is the fixed waveform (length: $L_i$) of the i-th channel
$x'_i$: vector obtained by time reverse synthesis of x using $H_i$ (${x'_i}^{t} = x^{t} H_i$)


Here, the transformation from equation 4 to equation 37 is shown for each of the numerator term (equation
38) and the denominator term (equation 39).

$$
\begin{aligned}
(x^{t} H c)^{2} &= \left( x^{t} H (W_1 d_1 + W_2 d_2 + W_3 d_3) \right)^{2} \\
&= \left( x^{t} (H_1 d_1 + H_2 d_2 + H_3 d_3) \right)^{2} \\
&= \left( (x^{t} H_1) d_1 + (x^{t} H_2) d_2 + (x^{t} H_3) d_3 \right)^{2} \\
&= \left( {x'_1}^{t} d_1 + {x'_2}^{t} d_2 + {x'_3}^{t} d_3 \right)^{2} \\
&= \left( \sum_{i=1}^{3} {x'_i}^{t} d_i \right)^{2}
\end{aligned}
\qquad (38)
$$

where $x$: random codebook searching target (vector)
$x^{t}$: transposed vector of x
$H$: impulse response convolution matrix of the synthesis filter
$c$: random code vector ($c = W_1 d_1 + W_2 d_2 + W_3 d_3$)
$W_i$: fixed waveform convolution matrix
$d_i$: impulse (vector) for each channel
$H_i$: impulse response convolution matrix for each waveform ($H_i = H W_i$)
$x'_i$: vector obtained by time reverse synthesis of x using $H_i$ (${x'_i}^{t} = x^{t} H_i$)

$$
\begin{aligned}
\| H c \|^{2} &= \| H (W_1 d_1 + W_2 d_2 + W_3 d_3) \|^{2} \\
&= \| H_1 d_1 + H_2 d_2 + H_3 d_3 \|^{2} \\
&= (H_1 d_1 + H_2 d_2 + H_3 d_3)^{t} (H_1 d_1 + H_2 d_2 + H_3 d_3) \\
&= \sum_{i=1}^{3} \sum_{j=1}^{3} d_i^{t} H_i^{t} H_j d_j
\end{aligned}
\qquad (39)
$$

where $H$: impulse response convolution matrix of the synthesis filter
$c$: random code vector ($c = W_1 d_1 + W_2 d_2 + W_3 d_3$)
$W_i$: fixed waveform convolution matrix
$d_i$: impulse (vector) for each channel
$H_i$: impulse response convolution matrix for each waveform ($H_i = H W_i$)


The operation of the thus constituted CELP type speech coder will be described. 

To begin with, the impulse response calculator 202 convolutes the three stored fixed waveforms and the impulse
response h to compute three kinds of impulse responses h1, h2 and h3 for the individual fixed waveforms, and sends
them to the synthesis filter 192' and the correlation matrix calculator 204.

Next, the synthesis filter 192' convolutes the random codebook searching target x, time-reversed by the time
reversing section 191, and the three input impulse responses h1, h2 and h3 for the individual waveforms. The
time reversing section 193 time-reverses the three kinds of output vectors from the synthesis filter 192' again to yield
three time-reversed synthesis targets x'1, x'2 and x'3, and sends them to the distortion calculator 205.

Then, the correlation matrix calculator 204 computes the autocorrelation of each of the three input impulse
responses h1, h2 and h3 for the individual waveforms and the correlations between h1 and h2, h1 and h3, and h2 and h3,
and sends the obtained autocorrelation and correlation values to the distortion calculator 205 after developing them in
the correlation matrix RR.

The above process having been executed as a pre-process, the fixed waveform arranging section 201 selects one
start position candidate of a fixed waveform for each channel, and sends the positional information to the impulse
generator 203.

The impulse generator 203 sets a pulse of an amplitude 1 (a polarity present) at each of the start position
candidates obtained from the fixed waveform arranging section 201, generating impulses d1, d2 and d3 for the individual
channels, and sends them to the distortion calculator 205.

Then, the distortion calculator 205 computes a reference value for minimizing the coding distortion in equation
37, by using the three time-reversed synthesis targets x'1, x'2 and x'3 for the individual waveforms, the correlation matrix
RR and the three impulses d1, d2 and d3 for the individual channels.

The process from the selection of start position candidates corresponding to the three channels by the fixed
waveform arranging section 201 to the distortion computation by the distortion calculator 205 is repeated for every
combination of the start position candidates selectable by the fixed waveform arranging section 201. Then, the code
number corresponding to the combination of the start position candidates that gives the best reference value in
equation 37 (that is, the smallest coding distortion) and the then optimal random code vector gain gc are specified as
the code of the random codebook, and are sent to the transmitter.

The speech decoder of this mode has a similar structure to that of the eleventh mode in FIG. 19B, and the fixed
waveform storage section and the fixed waveform arranging section in the speech coder have the same structures as the
fixed waveform storage section and the fixed waveform arranging section in the speech decoder. The fixed waveforms
stored in the fixed waveform storage section are obtained by training so as to statistically minimize a cost function,
the cost function being the coding distortion equation (equation 3) evaluated with a random codebook searching
target.

According to the thus constructed speech coder/decoder, when the start position candidates of fixed waveforms in
the fixed waveform arranging section can be computed algebraically, the numerator in equation 37 can be computed
by adding the three terms of the time-reversed synthesis target for each waveform, obtained in the previous processing
stage, and then obtaining the square of the result. Further, the denominator in equation 37 can be computed by adding
the nine terms in the correlation matrix of the impulse responses of the individual waveforms obtained in the previous
processing stage. This can ensure searching with about the same amount of computation as needed in a case where
the conventional algebraic structural excitation vector (an excitation vector constituted by several pulses of an
amplitude 1) is used for the random codebook.
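
The computational shortcut can be illustrated with a small sketch that evaluates the ratio of equation 37 from per-channel quantities and compares it with a direct evaluation of the squared inner product of x with Hc divided by the energy of Hc. It assumes zero-state, frame-truncated convolution for the synthesis filter; the function names are hypothetical. For brevity the correlation terms are recomputed on each call, whereas a practical search would look them up in the precomputed matrix RR.

```python
def convolve(a, b, length):
    """Truncated convolution of a and b, keeping the first `length` samples."""
    out = [0.0] * length
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < length:
                out[i + j] += ai * bj
    return out


def shifted(h, p):
    """h delayed by p samples and truncated to its original length."""
    return [0.0] * p + h[: len(h) - p]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def precompute(x, h, waveforms):
    """Per-subframe pre-process: impulse responses h_i and targets x'_i."""
    L = len(x)
    hs = [convolve(w, h, L) for w in waveforms]                     # h_i = w_i * h
    xs = [[dot(x, shifted(hi, p)) for p in range(L)] for hi in hs]  # x'_i
    return hs, xs


def fast_criterion(hs, xs, positions, signs):
    """Ratio of equation 37: three-term numerator, nine-term denominator."""
    num = sum(s * xi[p] for s, xi, p in zip(signs, xs, positions)) ** 2
    den = sum(
        si * sj * dot(shifted(hi, pi), shifted(hj, pj))
        for si, hi, pi in zip(signs, hs, positions)
        for sj, hj, pj in zip(signs, hs, positions)
    )
    return num / den


def direct_criterion(x, h, waveforms, positions, signs):
    """Reference: build c, synthesize Hc, then form the same ratio directly."""
    L = len(x)
    c = [0.0] * L
    for w, p, s in zip(waveforms, positions, signs):
        for k, sample in enumerate(w):
            if p + k < L:
                c[p + k] += s * sample
    hc = convolve(c, h, L)
    return dot(x, hc) ** 2 / dot(hc, hc)
```

In the usual CELP formulation with an optimal gain, a larger value of this ratio corresponds to a smaller coding distortion, so a search loop would keep the combination with the largest value.
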

Furthermore, a synthesized excitation vector in the synthesis filter has a characteristic statistically close to
that of an actual target, so that a high-quality synthesized speech can be obtained.

Although the foregoing description of this mode has been given with reference to a case where fixed waveforms
obtained through training are stored in the fixed waveform storage section, high-quality synthesized speeches can also
be obtained even when fixed waveforms prepared based on the result of statistical analysis of the random codebook
searching target x are used or when knowledge-based fixed waveforms are used.

While the description of this mode has been given with reference to a case of using three fixed waveforms, similar
functions and advantages can be provided if the number of fixed waveforms is changed to other values.

Although the fixed waveform arranging section in this mode has been described as having the start position
candidate information of fixed waveforms given in Table 8, similar functions and advantages can be provided for other
start position candidate information of fixed waveforms than those in Table 8.


(Thirteenth Mode) 

FIG. 21 presents a structural block diagram of a CELP type speech coder according to this mode. The speech
coder according to this mode has two kinds of random codebooks A 211 and B 212, a switch 213 for switching the two
kinds of random codebooks from one to the other, a multiplier 214 for multiplying a random code vector by a gain, a
synthesis filter 215 for synthesizing a random code vector output from the random codebook that is connected by
means of the switch 213, and a distortion calculator 216 for computing coding distortion in equation 2.

The random codebook A 211 has the structure of the excitation vector generator of the tenth mode, while the other
random codebook B 212 is constituted by a random sequence storage section 217 storing a plurality of random code
vectors generated from a random sequence. Switching between the random codebooks is carried out in a closed loop.
Here, x is a random codebook searching target.

The operation of the thus constituted CELP type speech coder will be discussed.

First, the switch 213 is connected to the random codebook A 211, and the fixed waveform arranging section 182
arranges (shifts) the fixed waveforms, read from the fixed waveform storage section 181, at the positions selected from
start position candidates of fixed waveforms respectively, based on the start position candidate information for fixed
waveforms it has as shown in Table 8. The arranged fixed waveforms are added together in the adding section 183 to
become a random code vector, which is sent to the synthesis filter 215 after being multiplied by the random code vector
gain. The synthesis filter 215 synthesizes the input random code vector and sends the result to the distortion calculator
216.

The distortion calculator 216 computes the coding distortion in equation 2 by using the random
codebook searching target x and the synthesized code vector obtained from the synthesis filter 215.

After computing the distortion, the distortion calculator 216 sends a signal to the fixed waveform arranging section
182. The process from the selection of start position candidates corresponding to the three channels by the fixed
waveform arranging section 182 to the distortion computation by the distortion calculator 216 is repeated for every
combination of the start position candidates selectable by the fixed waveform arranging section 182.

Thereafter, the combination of the start position candidates that minimizes the coding distortion is selected, and the
code number which corresponds, one to one, to that combination of the start position candidates, the then optimal
random code vector gain gc and the minimum coding distortion value are memorized.

Then, the switch 213 is connected to the random codebook B 212, causing a random sequence read from the
random sequence storage section 217 to become a random code vector. This random code vector, after being multiplied
by the random code vector gain, is input to the synthesis filter 215. The synthesis filter 215 synthesizes the input
random code vector and sends the result to the distortion calculator 216.

The distortion calculator 216 computes the coding distortion in equation 2 by using the random codebook
searching target x and the synthesized code vector obtained from the synthesis filter 215.

After computing the distortion, the distortion calculator 216 sends a signal to the random sequence storage section
217. The process from the selection of the random code vector by the random sequence storage section 217 to the
distortion computation by the distortion calculator 216 is repeated for every random code vector selectable by the random
sequence storage section 217.

Thereafter, the random code vector that minimizes the coding distortion is selected, and the code number of that
random code vector, the then optimal random code vector gain gc and the minimum coding distortion value are
memorized.

Then, the distortion calculator 216 compares the minimum coding distortion value obtained when the switch 213 is
connected to the random codebook A 211 with the minimum coding distortion value obtained when the switch 213 is
connected to the random codebook B 212, and determines, as speech codes, the switch connection information for which
the smaller coding distortion was obtained together with the corresponding code number and random code vector gain.
These speech codes are sent to a transmitter (not illustrated).
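
The closed-loop selection between the two codebooks can be sketched as follows. Equation 2 is assumed here to be the squared error between the target and the gain-scaled synthesized vector, with the gain set to its least-squares optimum; the `synthesize` argument stands in for the synthesis filter 215, and all names are hypothetical. Each codebook is assumed to contain at least one vector with nonzero synthesized energy.

```python
def search_codebook(target, candidates, synthesize):
    """Return (min_distortion, code_number, gain) over all candidate vectors."""
    best = None
    for code_number, vector in enumerate(candidates):
        s = synthesize(vector)
        energy = sum(v * v for v in s)
        if energy == 0.0:
            continue  # a silent candidate cannot reduce the distortion
        gain = sum(t * v for t, v in zip(target, s)) / energy  # least-squares gain
        distortion = sum((t - gain * v) ** 2 for t, v in zip(target, s))
        if best is None or distortion < best[0]:
            best = (distortion, code_number, gain)
    return best


def closed_loop_select(target, codebook_a, codebook_b, synthesize):
    """Search both codebooks and keep the one giving the smaller distortion."""
    result_a = search_codebook(target, codebook_a, synthesize)
    result_b = search_codebook(target, codebook_b, synthesize)
    if result_a[0] <= result_b[0]:  # ties go to codebook A (an arbitrary choice)
        return ("A",) + result_a[1:]
    return ("B",) + result_b[1:]
```

The returned tuple corresponds to the switch connection information, the code number and the random code vector gain that make up the speech code.
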

The speech decoder according to this mode, which is paired with the speech coder of this mode, has the random
codebook A, the random codebook B, the switch, the random code vector gain and the synthesis filter having the same
structures and arranged in the same way as those in FIG. 21. In this decoder, the random codebook to be used, the
random code vector and the random code vector gain are determined based on a speech code input from the transmitter,
and a synthesized excitation vector is obtained as the output of the synthesis filter.

According to the speech coder/decoder with the above structures, the random code vector that minimizes the coding
distortion in equation 2, among the random code vectors to be generated from the random codebook A and the random
code vectors to be generated from the random codebook B, can be selected in a closed loop, making it possible to
generate an excitation vector closer to an actual speech and a high-quality synthesized speech.

Although this mode has been illustrated as a speech coder/decoder based on the structure in FIG. 2 of the
conventional CELP type speech coder, similar functions and advantages can be provided even if this mode is adapted to a
CELP type speech coder/decoder based on the structure in FIGS. 19A and 19B or FIG. 20.

Although the random codebook A 211 in this mode has the same structure as shown in FIG. 18, similar functions
and advantages can be provided even if the fixed waveform storage section 181 takes another structure (e.g., in a case
where it has four fixed waveforms).

While the description of this mode has been given with reference to a case where the fixed waveform arranging
section 182 of the random codebook A 211 has the start position candidate information of fixed waveforms as shown
in Table 8, similar functions and advantages can be provided even for a case where the section 182 has other start
position candidate information of fixed waveforms.

Although this mode has been described with reference to a case where the random codebook B 212 is constituted
by the random sequence storage section 217 for directly storing a plurality of random sequences in the memory, similar
functions and advantages can be provided even for a case where the random codebook B 212 takes other excitation
vector structures (e.g., when it is constituted by excitation vector generation information with an algebraic structure).

Although this mode has been described as a CELP type speech coder/decoder having two kinds of random
codebooks, similar functions and advantages can be provided even in a case of using a CELP type speech coder/decoder
having three or more kinds of random codebooks.

(Fourteenth Mode) 

FIG. 22 presents a structural block diagram of a CELP type speech coder according to this mode. The speech
coder according to this mode has two kinds of random codebooks. One random codebook has the structure of the
excitation vector generator shown in FIG. 18, and the other one is constituted by a pulse sequences storage section which
retains a plurality of pulse sequences. The random codebooks are adaptively switched from one to the other by using
a quantized pitch gain already acquired before random codebook search.

The random codebook A 211, which comprises the fixed waveform storage section 181, fixed waveform arranging
section 182 and adding section 183, corresponds to the excitation vector generator in FIG. 18. A random codebook B
221 is comprised of a pulse sequences storage section 222 where a plurality of pulse sequences are stored. The
random codebooks A 211 and B 221 are switched from one to the other by means of a switch 213'. A multiplier 224 outputs
an adaptive code vector which is the output of an adaptive codebook 223 multiplied by the pitch gain that has already
been acquired at the time of random codebook search. The output of a pitch gain quantizer 225 is given to the switch
213'.

The operation of the thus constituted CELP type speech coder will be described. 

According to the conventional CELP type speech coder, the adaptive codebook 223 is searched first, and the
random codebook search is carried out based on the result. This adaptive codebook search is a process of selecting an
optimal adaptive code vector from a plurality of adaptive code vectors stored in the adaptive codebook 223 (vectors
each obtained by multiplying an adaptive code vector and a random code vector by their respective gains and then
adding them together). As a result of the process, the code number and pitch gain of an adaptive code vector are
generated.

According to the CELP type speech coder of this mode, the pitch gain quantizer 225 quantizes this pitch gain,
generating a quantized pitch gain, after which random codebook search will be performed. The quantized pitch gain
obtained by the pitch gain quantizer 225 is sent to the switch 213' for switching between the random codebooks.

The switch 213' connects to the random codebook A 211 when the value of the quantized pitch gain is small, by
which it is considered that the input speech is unvoiced, and connects to the random codebook B 221 when the value
of the quantized pitch gain is large, by which it is considered that the input speech is voiced.
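
The switching rule can be sketched in a few lines. The text only distinguishes "small" from "large" quantized pitch gains, so the threshold value of 0.5 and the function name `select_codebook` are purely hypothetical.

```python
def select_codebook(quantized_pitch_gain, threshold=0.5):
    """Return "A" (fixed-waveform codebook 211) for small gains, "B" (pulse codebook 221) otherwise.

    The threshold is a placeholder; the source does not give a value.
    """
    return "A" if quantized_pitch_gain < threshold else "B"
```

Because the decision depends only on the quantized pitch gain, which is itself part of the speech code, a decoder applying the same rule would presumably reach the same choice without separate switch information.
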
When the switch 213' is connected to the random codebook A 211, the fixed waveform arranging section 182
arranges (shifts) the fixed waveforms, read from the fixed waveform storage section 181, at the positions selected from
the start position candidates of the respective fixed waveforms, based on the start position candidate information for
fixed waveforms it holds, as shown in Table 8. The arranged fixed waveforms are sent to the adding section 183 and added
together to become a random code vector. The random code vector is sent to the synthesis filter 215 after being
multiplied by the random code vector gain. The synthesis filter 215 synthesizes the input random code vector and sends
the result to the distortion calculator 216.

The distortion calculator 216 computes the coding distortion of the equation 2 by using the target X for random
codebook search and the synthesized code vector obtained from the synthesis filter 215.

After computing the distortion, the distortion calculator 216 sends a signal to the fixed waveform arranging section
182. The process from the selection of start position candidates corresponding to the three channels by the fixed
waveform arranging section 182 to the distortion computation by the distortion calculator 216 is repeated for every
combination of the start position candidates selectable by the fixed waveform arranging section 182.

Thereafter, the combination of the start position candidates that minimizes the coding distortion is selected, and the
code number which corresponds, one to one, to that combination of the start position candidates, the random code
vector gain gc that is optimal at that time, and the quantized pitch gain are transferred to a transmitter as a speech
code. In this mode, the fixed waveform patterns to be stored in the fixed waveform storage section 181 should reflect
the property of unvoiced sound before speech coding takes place.
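
The exhaustive search described above can be sketched as follows. Equation 2 is not reproduced in this part of the text, so the distortion below is an assumption: the usual gain-optimized squared error between the target and the synthesized vector. The synthesis filter is modeled as a truncated convolution with a hypothetical impulse response, the function names are illustrative, and the sign column of the candidate table is not modeled separately (a negative optimal gain absorbs it in this sketch).

```python
from itertools import product

def arrange(waveforms, starts, length):
    """Shift each fixed waveform to its start position and add them together."""
    vec = [0.0] * length
    for wave, start in zip(waveforms, starts):
        for k, sample in enumerate(wave):
            if start + k < length:
                vec[start + k] += sample
    return vec

def synthesize(vec, impulse_response):
    """Zero-state synthesis filtering written as a truncated convolution."""
    out = [0.0] * len(vec)
    for n in range(len(vec)):
        for k in range(min(n + 1, len(impulse_response))):
            out[n] += impulse_response[k] * vec[n - k]
    return out

def search(target, waveforms, candidates, impulse_response):
    """Try every combination of start positions and keep the best one.

    Returns (distortion, start positions, optimal gain), or None if every
    candidate vector synthesizes to silence.
    """
    target_energy = sum(t * t for t in target)
    best = None
    for starts in product(*candidates):
        syn = synthesize(arrange(waveforms, starts, len(target)), impulse_response)
        energy = sum(s * s for s in syn)
        if energy == 0.0:
            continue
        corr = sum(t * s for t, s in zip(target, syn))
        gain = corr / energy                       # gain minimizing the error
        dist = target_energy - corr * corr / energy
        if best is None or dist < best[0]:
            best = (dist, starts, gain)
    return best
```

The index of the winning combination in the iteration order could serve as the one-to-one code number, although the text does not specify that mapping.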

When the switch 213' is connected to the random codebook B 212, a pulse sequence read from the pulse
sequences storage section 222 becomes a random code vector. This random code vector is input to the synthesis filter
215 after passing through the switch 213' and being multiplied by the random code vector gain. The synthesis filter 215
synthesizes the input random code vector and sends the result to the distortion calculator 216.

The distortion calculator 216 computes the coding distortion of the equation 2 by using the target X for random
codebook search and the synthesized code vector obtained from the synthesis filter 215.

After computing the distortion, the distortion calculator 216 sends a signal to the pulse sequences storage section
222. The process from the selection of the random code vector by the pulse sequences storage section 222 to the
distortion computation by the distortion calculator 216 is repeated for every random code vector selectable by the pulse
sequences storage section 222.

Thereafter, the random code vector that minimizes the coding distortion is selected, and the code number of that
random code vector, the random code vector gain gc that is optimal at that time, and the quantized pitch gain are
transferred to the transmitter as a speech code.

The speech decoder according to this mode, which is paired with the speech coder of this mode, has the random
codebook A, the random codebook B, the switch, the random code vector gain and the synthesis filter having the same
structures and arranged in the same way as those in FIG. 22. First, upon reception of the transmitted quantized pitch
gain, the decoder determines from its level whether the switch 213' on the coder side has been connected to the random
codebook A 211 or to the random codebook B 221. Next, based on the code number and the sign of the random code
vector, a synthesized excitation vector is obtained as the output of the synthesis filter.

According to the speech coder/decoder with the above structures, two kinds of random codebooks can be switched
adaptively in accordance with the characteristic of an input speech (the level of the quantized pitch gain is used to
determine the transmitted quantized pitch gain in this mode), so that when the input speech is voiced, a pulse sequence
can be selected as a random code vector, whereas for a strong voiceless property, a random code vector which reflects
the property of voiceless sounds can be selected. This can ensure generation of excitation vectors closer to the actual
sound property and improvement of synthesized sounds. Because switching is performed in a closed loop in this mode
as mentioned above, the functional effects can be improved by increasing the amount of information to be transmitted.

Although this mode has been illustrated as a speech coder/decoder based on the structure in FIG. 2 of the
conventional CELP type speech coder, similar functions and advantages can be provided even if this mode is adapted to a
CELP type speech coder/decoder based on the structure in FIGS. 19A and 19B or FIG. 20.

In this mode, a quantized pitch gain acquired by quantizing the pitch gain of an adaptive code vector in the pitch
gain quantizer 225 is used as a parameter for switching the switch 213'. A pitch period calculator may be provided so
that a pitch period computed from an adaptive code vector can be used instead.

Although the random codebook A 211 in this mode has the same structure as shown in FIG. 18, similar functions
and advantages can be provided even if the fixed waveform storage section 181 takes another structure (e.g., in a case
where it has four fixed waveforms).

While the description of this mode has been given with reference to the case where the fixed waveform arranging
section 182 of the random codebook A 211 has the start position candidate information of fixed waveforms as shown
in Table 8, similar functions and advantages can be provided even for a case where the section 182 has other start
position candidate information of fixed waveforms.

Although this mode has been described with reference to the case where the random codebook B 212 is
constituted by the pulse sequences storage section 222 for directly storing a pulse sequence in the memory, similar
functions and advantages can be provided even for a case where the random codebook B 212 takes other excitation vector
structures (e.g., when it is constituted by excitation vector generation information with an algebraic structure).

Although this mode has been described as a CELP type speech coder/decoder having two kinds of random
codebooks, similar functions and advantages can be provided even in a case of using a CELP type speech coder/decoder
having three or more kinds of random codebooks.

(Fifteenth Mode) 

FIG. 23 presents a structural block diagram of a CELP type speech coder according to this mode. The speech
coder according to this mode has two kinds of random codebooks. One random codebook takes the structure of the
excitation vector generator shown in FIG. 18 and has three fixed waveforms stored in the fixed waveform storage
section, and the other one likewise takes the structure of the excitation vector generator shown in FIG. 18 but has two
fixed waveforms stored in the fixed waveform storage section. Those two kinds of random codebooks are switched in a
closed loop.

The random codebook A 211, which comprises a fixed waveform storage section A 181 having three fixed
waveforms stored therein, a fixed waveform arranging section A 182 and an adding section 183, corresponds to the
structure of the excitation vector generator in FIG. 18 with three fixed waveforms stored in the fixed waveform
storage section.

A random codebook B 230 comprises a fixed waveform storage section B 231 having two fixed waveforms stored
therein, a fixed waveform arranging section B 232 having start position candidate information of fixed waveforms as
shown in Table 9, and an adding section 233, which adds two fixed waveforms, arranged by the fixed waveform arranging
section B 232, thereby generating a random code vector. The random codebook B 230 corresponds to the structure of
the excitation vector generator in FIG. 18 with two fixed waveforms stored in the fixed waveform storage
section.

Table 9

Channel number   Sign   Start position candidates of fixed waveforms
CH1              ±1     P1: 0, 4, 8, 12, 16, ..., 72, 76
                            2, 6, 10, 14, 18, ..., 74, 78
CH2              ±1     P2: 1, 5, 9, 13, 17, ..., 73, 77
                            3, 7, 11, 15, 19, ..., 75, 79


The other structure is the same as that of the above-described thirteenth mode.

The operation of the CELP type speech coder constructed in the above way will be described.

First, the switch 213 is connected to the random codebook A 211, and the fixed waveform arranging section A 182
arranges (shifts) three fixed waveforms, read from the fixed waveform storage section A 181, at the positions selected
from the start position candidates of the respective fixed waveforms, based on the start position candidate information
for fixed waveforms it holds, as shown in Table 8. The arranged three fixed waveforms are output to the adding section
183 and added together to become a random code vector. This random code vector is sent to the synthesis filter 215
through the switch 213 and the multiplier 214 for multiplying it by the random code vector gain. The synthesis filter 215
synthesizes the input random code vector and sends the result to the distortion calculator 216.

The distortion calculator 216 computes the coding distortion of the equation 2 by using the random codebook search
target X and the synthesized code vector obtained from the synthesis filter 215.

After computing the distortion, the distortion calculator 216 sends a signal to the fixed waveform arranging section
A 182. The process from the selection of start position candidates corresponding to the three channels by the fixed
waveform arranging section A 182 to the distortion computation by the distortion calculator 216 is repeated for every
combination of the start position candidates selectable by the fixed waveform arranging section A 182.

Thereafter, the combination of the start position candidates that minimizes the coding distortion is selected, and the
code number which corresponds, one to one, to that combination of the start position candidates, the random code
vector gain gc that is optimal at that time, and the minimum coding distortion value are memorized.

In this mode, the fixed waveform patterns stored in the fixed waveform storage section A 181 before speech
coding are those acquired through training so as to minimize distortion under the condition that three
fixed waveforms are in use.

Next, the switch 213 is connected to the random codebook B 230, and the fixed waveform arranging section B 232
arranges (shifts) two fixed waveforms, read from the fixed waveform storage section B 231, at the positions selected
from the start position candidates of the respective fixed waveforms, based on the start position candidate information
for fixed waveforms it holds, as shown in Table 9. The arranged two fixed waveforms are output to the adding section 233
and added together to become a random code vector. This random code vector is sent to the synthesis filter 215 through
the switch 213 and the multiplier 214 for multiplying it by the random code vector gain. The synthesis filter 215
synthesizes the input random code vector and sends the result to the distortion calculator 216.

The distortion calculator 216 computes the coding distortion of the equation 2 by using the target X for random
codebook search and the synthesized code vector obtained from the synthesis filter 215.

After computing the distortion, the distortion calculator 216 sends a signal to the fixed waveform arranging section
B 232. The process from the selection of start position candidates corresponding to the two channels by the fixed
waveform arranging section B 232 to the distortion computation by the distortion calculator 216 is repeated for every
combination of the start position candidates selectable by the fixed waveform arranging section B 232.

Thereafter, the combination of the start position candidates that minimizes the coding distortion is selected, and the
code number which corresponds, one to one, to that combination of the start position candidates, the random code
vector gain gc that is optimal at that time, and the minimum coding distortion value are memorized. In this mode, the
fixed waveform patterns stored in the fixed waveform storage section B 231 before speech coding are those acquired
through training so as to minimize distortion under the condition that two fixed waveforms are in use.

Then, the distortion calculator 216 compares the minimum coding distortion value obtained when the switch 213 is
connected to the random codebook A 211 with the minimum coding distortion value obtained when the switch 213 is
connected to the random codebook B 230, and determines the switch connection information for which the smaller coding
distortion was obtained. That switch connection information, together with the corresponding code number and random
code vector gain, is determined as the speech code and sent to the transmitter.
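
The closed-loop choice between the two codebooks amounts to keeping the better of two memorized search results. A minimal sketch, assuming each search has already produced a (minimum distortion, code number, gain) triple; the dictionary layout and field names are illustrative and not taken from the text:

```python
def choose_codebook(results):
    """Pick the codebook whose best candidate has the smaller distortion.

    `results` maps a codebook label (for example "A" or "B") to a tuple
    (min_distortion, code_number, gain) memorized during its search.
    """
    label = min(results, key=lambda name: results[name][0])
    _, code_number, gain = results[label]
    # The label plays the role of the switch connection information.
    return {"switch": label, "code_number": code_number, "gain": gain}
```

Unlike switching driven by a parameter the decoder already has, the chosen label must be transmitted here so that the decoder can set its own switch.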

The speech decoder according to this mode has the random codebook A, the random codebook B, the switch, the
random code vector gain and the synthesis filter having the same structures and arranged in the same way as those in
FIG. 23. A random codebook to be used, a random code vector and a random code vector gain are determined based
on a speech code input from the transmitter, and a synthesized excitation vector is obtained as the output of the
synthesis filter.

According to the speech coder/decoder with the above structures, the one of the random code vectors to be generated
from the random codebook A and the random code vectors to be generated from the random codebook B which
minimizes the coding distortion in the equation 2 can be selected in a closed loop, making it possible to generate an
excitation vector closer to an actual speech and a high-quality synthesized speech.

Although this mode has been illustrated as a speech coder/decoder based on the structure in FIG. 2 of the
conventional CELP type speech coder, similar functions and advantages can be provided even if this mode is adapted to a
CELP type speech coder/decoder based on the structure in FIGS. 19A and 19B or FIG. 20.

Although this mode has been described with reference to the case where the fixed waveform storage section A 181
of the random codebook A 211 stores three fixed waveforms, similar functions and advantages can be provided even if
the fixed waveform storage section A 181 stores a different number of fixed waveforms (e.g., in a case where it has four
fixed waveforms). The same is true of the random codebook B 230.

While the description of this mode has been given with reference to the case where the fixed waveform arranging
section A 182 of the random codebook A 211 has the start position candidate information of fixed waveforms as shown
in Table 8, similar functions and advantages can be provided even for a case where the section 182 has other start
position candidate information of fixed waveforms. The same applies to the random codebook B 230.

Although this mode has been described as a CELP type speech coder/decoder having two kinds of random code- 
books, similar functions and advantages can be provided even in a case of using a CELP type speech coder/decoder 
having three or more kinds of random codebooks. 

(Sixteenth Mode)

FIG. 24 presents a structural block diagram of a CELP type speech coder according to this mode. The speech
coder acquires LPC coefficients by performing autocorrelation analysis and LPC analysis on input speech data 241 in
an LPC analyzing section 242, encodes the obtained LPC coefficients to acquire LPC codes, and decodes the obtained
LPC codes to yield decoded LPC coefficients.

Next, an excitation vector generator 245 acquires an adaptive code vector and a random code vector from an
adaptive codebook 243 and an excitation vector generator 244, and sends them to an LPC synthesis filter 246. One of the
excitation vector generators of the above-described first to fourth and tenth modes is used for the excitation vector
generator 244. Further, the LPC synthesis filter 246 filters the two excitation vectors, obtained by the excitation vector
generator 245, with the decoded LPC coefficients obtained by the LPC analyzing section 242, thereby yielding two
synthesized speeches.

A comparator 247 analyzes a relationship between the two synthesized speeches, obtained by the LPC synthesis
filter 246, and the input speech, yielding optimal values (optimal gains) of the two synthesized speeches, adds the
synthesized speeches whose powers have been adjusted with the optimal gains, acquiring a total synthesized speech, and
then computes a distance between the total synthesized speech and the input speech.
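
The text does not state how the comparator derives the optimal gains. One common choice, assumed here purely for illustration, is a joint least-squares fit of the two synthesized speeches to the input speech, followed by a squared-error distance. All function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def optimal_gains(x, a, s):
    """Gains (ga, gs) minimizing |x - ga*a - gs*s|^2 (2x2 normal equations)."""
    aa, ss, cross = dot(a, a), dot(s, s), dot(a, s)
    xa, xs = dot(x, a), dot(x, s)
    det = aa * ss - cross * cross
    if det == 0.0:
        raise ValueError("synthesized speeches are linearly dependent")
    ga = (xa * ss - xs * cross) / det
    gs = (xs * aa - xa * cross) / det
    return ga, gs

def distance(x, a, s, ga, gs):
    """Squared error between the input and the gain-adjusted total synthesis."""
    return sum((xi - ga * ai - gs * si) ** 2 for xi, ai, si in zip(x, a, s))
```

Here `x` is the input speech, `a` the synthesized adaptive code vector and `s` the synthesized random code vector, all over one coding unit.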

Distance computation is also carried out on the input speech and the multiple synthesized speeches that are
obtained by causing the excitation vector generator 245 and the LPC synthesis filter 246 to function with respect to all
the excitation vector samples that are generated by the adaptive codebook 243 and the excitation vector generator
244. Then, the index of the excitation vector sample which provides the minimum one of the distances obtained from
the computation is acquired. The obtained optimal gains, the obtained index of the excitation vector sample and the two
excitation vectors corresponding to that index are sent to a parameter coding section 248.

The parameter coding section 248 encodes the optimal gains to obtain gain codes, and the gain codes, the LPC codes
and the index of the excitation vector sample are all sent to a transmitter 249. An actual excitation signal is produced
from the gain codes and the two excitation vectors corresponding to the index, and an old excitation vector sample is
discarded at the same time the excitation signal is stored in the adaptive codebook 243.

FIG. 25 shows functional blocks of a section in the parameter coding section 248, which is associated with vector 
quantization of the gain. 

The parameter coding section 248 has a parameter converting section 2502 for converting input optimal gains 2501
to a sum of elements and a ratio with respect to the sum to acquire quantization target vectors; a target vector extracting
section 2503 for obtaining a target vector by using old decoded code vectors, stored in a decoded vector storage
section, and predictive coefficients stored in a predictive coefficients storage section; a decoded vector storage section
2504 where old decoded code vectors are stored; a predictive coefficients storage section 2505; a distance calculator
2506 for computing distances between a plurality of code vectors stored in a vector codebook and a target vector
obtained by the target vector extracting section by using predictive coefficients stored in the predictive coefficients
storage section; a vector codebook 2507 where a plurality of code vectors are stored; and a comparator 2508, which
controls the vector codebook and the distance calculator for comparison of the distances obtained from the distance
calculator to acquire the number of the most appropriate code vector, acquires a code vector from the vector storage
section based on the obtained number, and updates the content of the decoded vector storage section using that code
vector.

A detailed description will now be given of the operation of the thus constituted parameter coding section 248. The
vector codebook 2507 where a plurality of general samples (code vectors) of a quantization target vector are stored
should be prepared in advance. This is generally prepared by an LBG algorithm (IEEE TRANSACTIONS ON
COMMUNICATIONS, VOL. COM-28, NO. 1, PP. 84-95, JANUARY 1980) based on multiple vectors which are obtained by
analyzing multiple speech data.

Coefficients for predictive coding should be stored in the predictive coefficients storage section 2505. The
predictive coefficients will be discussed after the algorithm has been described. A value indicating an unvoiced state
should be stored as an initial value in the decoded vector storage section 2504. One example would be a code vector with
the lowest power.

First the input optimal gains 2501 (the gain of an adaptive excitation vector and the gain of a random excitation vec- 
tor) are converted to element vectors (inputs) of a sum and a ratio in the parameter converting section 2502. The con- 
version method is illustrated in an equation 40. 

P = log(Ga + Gs) (40)
R = Ga/(Ga + Gs)

where (Ga, Gs): optimal gain

Ga: gain of an adaptive excitation vector

Gs: gain of a stochastic excitation vector

(P, R): input vectors

P: sum

R: ratio

It is to be noted that Ga above is not necessarily a positive value. Thus, R may take a negative value. When
Ga + Gs becomes negative, a fixed value prepared in advance is substituted.
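
The conversion of equation 40 and its inverse can be sketched as below. The text does not say what the prepared fixed value is or exactly what it replaces, so this sketch assumes that it replaces a non-positive sum Ga + Gs; `FALLBACK_SUM` and both function names are hypothetical.

```python
import math

# Hypothetical stand-in for the "fixed value prepared in advance".
FALLBACK_SUM = 1e-3

def gains_to_sum_ratio(ga, gs, fallback_sum=FALLBACK_SUM):
    """Equation 40: map the gain pair (Ga, Gs) to the input vector (P, R)."""
    total = ga + gs
    if total <= 0.0:
        # log() is undefined here, so the prepared fixed value is used (assumed).
        total = fallback_sum
    return math.log(total), ga / total

def sum_ratio_to_gains(p, r):
    """Inverse mapping: Ga = R * exp(P), Gs = (1 - R) * exp(P)."""
    scale = math.exp(p)
    return r * scale, (1.0 - r) * scale
```

The inverse has the same form as the decoded gains Gan and Gsn that appear in the seventeenth mode.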

Next, based on the vectors obtained by the parameter converting section 2502, the target vector extracting section
2503 acquires a target vector by using old decoded code vectors, stored in the decoded vector storage section 2504,
and predictive coefficients stored in the predictive coefficients storage section 2505. An equation for computing the
target vector is given by an equation 41.

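\[ T_p = P - \Bigl( \sum_{i=1}^{I} U_{pi} \times p_i + \sum_{i=1}^{I} V_{pi} \times r_i \Bigr) \tag{41} \]
\[ T_r = R - \Bigl( \sum_{i=1}^{I} U_{ri} \times p_i + \sum_{i=1}^{I} V_{ri} \times r_i \Bigr) \]
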

where (Tp, Tr): target vector 

(P, R): input vector 

(pi, ri): old decoded vector 

Upi, Vpi, Uri, Vri: predictive coefficients (fixed values) 
i: index indicating how old the decoded vector is 
I: prediction order.

Then, the distance calculator 2506 computes a distance between a target vector obtained by the target vector
extracting section 2503 and a code vector stored in the vector codebook 2507 by using the predictive coefficients stored
in the predictive coefficients storage section 2505. An equation for computing the distance is given by an equation 42.

\[ D_n = W_p \times (T_p - U_{p0} \times C_{pn} - V_{p0} \times C_{rn})^2 + W_r \times (T_r - U_{r0} \times C_{pn} - V_{r0} \times C_{rn})^2 \tag{42} \]

where Dn: distance between a target vector and a code vector 
(Tp, Tr): target vector 

Up0, Vp0, Ur0, Vr0: predictive coefficients (fixed values)

(Cpn, Crn): code vector 

n: the number of the code vector 

Wp, Wr: weighting coefficient (fixed) for adjusting the sensitivity against distortion. 

Then, the comparator 2508 controls the vector codebook 2507 and the distance calculator 2506 to acquire the 
number of the code vector which has the shortest distance computed by the distance calculator 2506 from among a 
plurality of code vectors stored in the vector codebook 2507, and sets the number as a gain code 2509. Based on the 
obtained gain code 2509, the comparator 2508 acquires a decoded vector and updates the content of the decoded vec- 
tor storage section 2504 using that vector. An equation 43 shows how to acquire a decoded vector. 

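\[ p = \Bigl( \sum_{i=1}^{I} U_{pi} \times p_i + \sum_{i=1}^{I} V_{pi} \times r_i \Bigr) + U_{p0} \times C_{pn} + V_{p0} \times C_{rn} \tag{43} \]
\[ r = \Bigl( \sum_{i=1}^{I} U_{ri} \times p_i + \sum_{i=1}^{I} V_{ri} \times r_i \Bigr) + U_{r0} \times C_{pn} + V_{r0} \times C_{rn} \]

where (Cpn, Crn): code vector
(p, r): decoded vector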
(pi, ri): old decoded vector 

Upi, Vpi, Uri, Vri: predictive coefficients (fixed values) 
i: index indicating how old the decoded vector is 
I: prediction order, 
n: the number of the code vector. 

An equation 44 shows an updating scheme. 

Processing order

\[ p_0 = C_{pN} \tag{44} \]
\[ r_0 = C_{rN} \]
\[ p_i = p_{i-1} \quad (i = 1 \sim I) \]
\[ r_i = r_{i-1} \quad (i = 1 \sim I) \]

N: code of the gain.
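
Taken together, equations 41 to 44 describe one predictive vector quantization step. The sketch below is a minimal illustration under stated assumptions: each coefficient list holds index 0 (applied to the current code vector) followed by indices 1 to I, the history lists hold the stored vectors from newest to oldest, and the update is implemented as "insert the chosen code vector at the front, drop the oldest". The function names and the dictionary layout are hypothetical.

```python
def predict(u, v, p_old, r_old):
    """Prediction term: sum over i = 1..I of u[i] * p_i + v[i] * r_i."""
    return sum(u[i + 1] * p + v[i + 1] * r
               for i, (p, r) in enumerate(zip(p_old, r_old)))

def quantize_gain(P, R, codebook, coef, p_old, r_old, wp=1.0, wr=1.0):
    """Return (gain code, decoded vector, updated p history, updated r history)."""
    up, vp, ur, vr = coef["Up"], coef["Vp"], coef["Ur"], coef["Vr"]
    pred_p = predict(up, vp, p_old, r_old)
    pred_r = predict(ur, vr, p_old, r_old)
    tp, tr = P - pred_p, R - pred_r                      # equation 41

    def dist(code_vector):                               # equation 42
        cp, cr = code_vector
        return (wp * (tp - up[0] * cp - vp[0] * cr) ** 2
                + wr * (tr - ur[0] * cp - vr[0] * cr) ** 2)

    n = min(range(len(codebook)), key=lambda k: dist(codebook[k]))
    cp, cr = codebook[n]
    p_dec = pred_p + up[0] * cp + vp[0] * cr             # equation 43
    r_dec = pred_r + ur[0] * cp + vr[0] * cr
    # Equation 44 (as interpreted here): newest code vector in, oldest out.
    return n, (p_dec, r_dec), [cp] + p_old[:-1], [cr] + r_old[:-1]
```

A decoder holding the same codebook, coefficients and history could reproduce the decoded vector from the gain code `n` alone by running only the last three steps.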

Meanwhile, the decoder, which should be provided in advance with a vector codebook, a predictive coefficients
storage section and a decoded vector storage section similar to those of the coder, performs decoding through the same
functions as those of the comparator of the coder, namely generating a decoded vector and updating the decoded vector
storage section, based on the gain code transmitted from the coder.

A scheme of setting predictive coefficients to be stored in the predictive coefficients storage section 2505 will now 
be described. 

Predictive coefficients are obtained by first quantizing a large amount of training speech data, collecting the input
vectors obtained from their optimal gains and the decoded vectors at the time of quantization to form a population, and
then minimizing the total distortion indicated by the following equation 45 for that population. Specifically, the values
of Upi and Uri are acquired by solving simultaneous equations which are derived by partial differentiation of the
equation of the total distortion with respect to Upi and Uri.



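\[ \mathrm{Total} = \sum_{t=0}^{T} \Bigl\{ W_p \times \Bigl( P_t - \sum_{i=0}^{I} U_{pi} \times p_{t,i} \Bigr)^2 + W_r \times \Bigl( R_t - \sum_{i=0}^{I} U_{ri} \times r_{t,i} \Bigr)^2 \Bigr\} \tag{45} \]
\[ p_{t,0} = C_{pn(t)}, \qquad r_{t,0} = C_{rn(t)} \]
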

where Total: total distortion

t: time (frame number)
T: the number of pieces of data in the population

(Pt, Rt): optimal gain at time t

(pt,i, rt,i): decoded vector at time t

Upi, Vpi, Uri, Vri: predictive coefficients (fixed values)

i: index indicating how old the decoded vector is
I: prediction order.

(Cpn(t), Crn(t)): code vector at time t

n: the number of the code vector

Wp, Wr: weighting coefficient (fixed) for adjusting the sensitivity against distortion.
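
As a worked illustration of the "simultaneous equations" mentioned above, setting the partial derivative of the total distortion with respect to each coefficient to zero gives one linear system per coefficient family. This is a sketch that assumes the form of equation 45 as written here, with the summation limits as reconstructed; the index k is introduced only for the derivation.

```latex
\frac{\partial\,\mathrm{Total}}{\partial U_{pk}}
  = -2\,W_p \sum_{t=0}^{T} \Bigl( P_t - \sum_{i=0}^{I} U_{pi}\, p_{t,i} \Bigr)\, p_{t,k} = 0
\quad\Longrightarrow\quad
\sum_{i=0}^{I} U_{pi} \sum_{t=0}^{T} p_{t,i}\, p_{t,k} = \sum_{t=0}^{T} P_t\, p_{t,k},
\qquad k = 0, \dots, I

\frac{\partial\,\mathrm{Total}}{\partial U_{rk}} = 0
\quad\Longrightarrow\quad
\sum_{i=0}^{I} U_{ri} \sum_{t=0}^{T} r_{t,i}\, r_{t,k} = \sum_{t=0}^{T} R_t\, r_{t,k},
\qquad k = 0, \dots, I
```

Each system has I + 1 unknowns and I + 1 equations. Because Wp and Wr are constants, they cancel and do not affect the solution.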

According to such a vector quantization scheme, the optimal gain can be vector-quantized as it is, the feature of the
parameter converting section can permit the use of the correlation between the relative levels of the power and each
gain, and the features of the decoded vector storage section, the predictive coefficients storage section, the target
vector extracting section and the distance calculator can ensure predictive coding of gains using the correlation between
the mutual relations between the power and two gains. Those features can allow the correlation among parameters to
be utilized sufficiently.



(Seventeenth Mode) 

FIG. 26 presents a structural block diagram of a parameter coding section of a speech coder according to this
mode. According to this mode, vector quantization is performed while evaluating gain-quantization originated distortion
from two synthesized speeches corresponding to the index of an excitation vector and a perceptually weighted input
speech.

As shown in FIG. 26, the parameter coding section has a parameter calculator 2602, which computes parameters
necessary for distance computation from the input data 2601 (a perceptually weighted input speech, a perceptually
weighted LPC synthesis of adaptive code vector and a perceptually weighted LPC synthesis of random code vector), a
decoded vector stored in a decoded vector storage section, and predictive coefficients stored in a predictive
coefficients storage section; a decoded vector storage section 2603 where old decoded code vectors are stored; a
predictive coefficients storage section 2604 where predictive coefficients are stored; a distance calculator 2605 for
computing the coding distortion obtained when decoding is implemented with a plurality of code vectors stored in a
vector codebook by using the predictive coefficients stored in the predictive coefficients storage section; a vector
codebook 2606 where a plurality of code vectors are stored; and a comparator 2607, which controls the vector codebook
and the distance calculator for comparison of the coding distortions obtained from the distance calculator to acquire the
number of the most appropriate code vector, acquires a code vector from the vector storage section based on the
obtained number, and updates the content of the decoded vector storage section using that code vector.

A description will now be given of the vector quantizing operation of the thus constituted parameter coding section.
The vector codebook 2606 where a plurality of general samples (code vectors) of a quantization target vector are stored
should be prepared in advance. This is generally prepared by an LBG algorithm (IEEE TRANSACTIONS ON
COMMUNICATIONS, VOL. COM-28, NO. 1, PP. 84-95, JANUARY 1980) or the like based on multiple vectors which are
obtained by analyzing multiple speech data. Coefficients for predictive coding should be stored in the predictive
coefficients storage section 2604. Those coefficients in use are the same predictive coefficients as stored in the
predictive coefficients storage section 2505 which has been discussed in (Sixteenth Mode). A value indicating an
unvoiced state should be stored as an initial value in the decoded vector storage section 2603.

First, the parameter calculator 2602 computes parameters necessary for distance computation from the input
perceptually weighted input speech, perceptually weighted LPC synthesis of adaptive code vector and perceptually
weighted LPC synthesis of random code vector, and further from the decoded vector stored in the decoded vector storage
section 2603 and the predictive coefficients stored in the predictive coefficients storage section 2604. The distances in
the distance calculator are based on the following equation 46.

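\[ E_n = \sum_{i=0}^{I} (X_i - G_{an} \times A_i - G_{sn} \times S_i)^2 \tag{46} \]
\[ G_{an} = O_{rn} \times \exp(O_{pn}) \]
\[ G_{sn} = (1 - O_{rn}) \times \exp(O_{pn}) \]
\[ O_{pn} = Y_p + U_{p0} \times C_{pn} + V_{p0} \times C_{rn} \]
\[ O_{rn} = Y_r + U_{r0} \times C_{pn} + V_{r0} \times C_{rn} \]
\[ Y_p = \sum_{j=1}^{J} U_{pj} \times p_j + \sum_{j=1}^{J} V_{pj} \times r_j \]
\[ Y_r = \sum_{j=1}^{J} U_{rj} \times p_j + \sum_{j=1}^{J} V_{rj} \times r_j \]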


where Gan, Gsn: decoded gain
(Opn, Orn): decoded vector
(Yp, Yr): predictive vector
En: coding distortion when the n-th gain code vector is used
Xi: perceptually weighted input speech
Ai: perceptually weighted LPC synthesis of adaptive code vector
Si: perceptually weighted LPC synthesis of stochastic code vector
n: code of the code vector
i: index of excitation data
I: subframe length (coding unit of the input speech)
(Cpn, Crn): code vector
(pj, rj): old decoded vector
Upj, Vpj, Urj, Vrj: predictive coefficients (fixed values)
j: index indicating how old the decoded vector is
J: prediction order.


Therefore, the parameter calculator 2602 computes those portions which do not depend on the number of a code
vector. What is to be computed are the predictive vector, and the correlations among the three synthesized speeches or
their powers. An equation for the computation is given by an equation 47.

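\[ Y_p = \sum_{j=1}^{J} U_{pj} \times p_j + \sum_{j=1}^{J} V_{pj} \times r_j \tag{47} \]
\[ Y_r = \sum_{j=1}^{J} U_{rj} \times p_j + \sum_{j=1}^{J} V_{rj} \times r_j \]
\[ D_{xx} = \sum_{i=0}^{I} X_i \times X_i \]
\[ D_{xa} = \sum_{i=0}^{I} X_i \times A_i \times 2 \]
\[ D_{xs} = \sum_{i=0}^{I} X_i \times S_i \times 2 \]
\[ D_{aa} = \sum_{i=0}^{I} A_i \times A_i \]
\[ D_{as} = \sum_{i=0}^{I} A_i \times S_i \times 2 \]
\[ D_{ss} = \sum_{i=0}^{I} S_i \times S_i \]
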

where (Yp, Yr): predictive vector

Dxx, Dxa, Dxs, Daa, Das, Dss: value of correlation among synthesized speeches or the power

Xi: perceptually weighted input speech

Ai: perceptually weighted LPC synthesis of adaptive code vector

Si: perceptually weighted LPC synthesis of stochastic code vector

i: index of excitation data 

I: subframe length (coding unit of the input speech) 
(pj, rj): old decoded vector 

Upj, Vpj, Urj, Vrj: predictive coefficients (fixed values) 
j: index indicating how old the decoded vector is 
J: prediction order. 

Then, the distance calculator 2605 computes the coding distortion for each code vector stored in the vector codebook
2606 by using the parameters computed by the parameter calculator 2602 and the predictive coefficients stored
in the predictive coefficients storage section 2604. An equation for computing the distortion is given by an equation 48.

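\[ E_n = D_{xx} + (G_{an})^2 \times D_{aa} + (G_{sn})^2 \times D_{ss} - G_{an} \times D_{xa} - G_{sn} \times D_{xs} + G_{an} \times G_{sn} \times D_{as} \tag{48} \]
\[ G_{an} = O_{rn} \times \exp(O_{pn}) \]
\[ G_{sn} = (1 - O_{rn}) \times \exp(O_{pn}) \]
\[ O_{pn} = Y_p + U_{p0} \times C_{pn} + V_{p0} \times C_{rn} \]
\[ O_{rn} = Y_r + U_{r0} \times C_{pn} + V_{r0} \times C_{rn} \]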

where En: coding distortion when the n-th gain code vector is used 

Dxx, Dxa, Dxs, Daa, Das, Dss: value of correlation among synthesized speeches or the power

Gan, Gsn: decoded gain

(Opn, Orn): decoded vector

(Yp, Yr): predictive vector 

UpO, VpO, UrO, VrO: predictive coefficients (fixed values) 

(Cpn, Crn): code vector 

n: the number of the code vector. 

Actually, Dxx does not depend on the number n of the code vector so that its addition can be omitted. 
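
The search over the gain codebook described by the equation 48 can be sketched as below. This is an illustrative sketch under stated assumptions: the function name, the dictionary `D` of correlation values and the list-of-pairs codebook layout are hypothetical, and Dxx is omitted from the distortion because it does not depend on n.

```python
import math


def search_gain_codebook(codebook, Yp, Yr, Up0, Vp0, Ur0, Vr0, D):
    """Return (n, En) minimising equation 48 without the constant Dxx term.

    codebook is a list of (Cpn, Crn) pairs; D holds Dxa, Dxs, Daa, Das, Dss.
    """
    best_n, best_E = None, math.inf
    for n, (Cp, Cr) in enumerate(codebook):
        Op = Yp + Up0 * Cp + Vp0 * Cr        # decoded vector, first element
        Or = Yr + Ur0 * Cp + Vr0 * Cr        # decoded vector, second element
        Ga = Or * math.exp(Op)               # decoded adaptive-code gain
        Gs = (1.0 - Or) * math.exp(Op)       # decoded stochastic-code gain
        E = (Ga * Ga * D["Daa"] + Gs * Gs * D["Dss"]
             - Ga * D["Dxa"] - Gs * D["Dxs"] + Ga * Gs * D["Das"])
        if E < best_E:
            best_n, best_E = n, E
    return best_n, best_E
```

The returned index corresponds to the gain code; adding Dxx to the returned value gives the full coding distortion En.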

Then, the comparator 2607 controls the vector codebook 2606 and the distance calculator 2605 to acquire, from
among the plurality of code vectors stored in the vector codebook 2606, the number of the code vector which has the
shortest distance computed by the distance calculator 2605, and sets the number as a gain code 2608. Based on the
obtained gain code 2608, the comparator 2607 acquires a decoded vector and updates the content of the decoded
vector storage section 2603 using that vector. A code vector is obtained from the equation 44.
The updating scheme of the equation 44 is also used.

Meanwhile, the speech decoder should be provided in advance with a vector codebook, a predictive coefficients
storage section and a decoded vector storage section similar to those of the speech coder. Based on the gain code
transmitted from the coder, it performs decoding through the same functions as the comparator of the coder, namely
generating a decoded vector and updating the decoded vector storage section.

According to the mode constituted in this way, vector quantization can be performed while evaluating the distortion
caused by gain quantization from the input speech and the two synthesized speeches corresponding to the index of the
excitation vector. The parameter converting section permits the use of the correlation between the relative levels of the
power and each gain, and the decoded vector storage section, the predictive coefficients storage section, the target
vector extracting section and the distance calculator ensure predictive coding of gains using the correlation between
the power and the two gains. The correlation among parameters can thus be utilized sufficiently.

(Eighteenth Mode) 

FIG. 27 presents a structural block diagram of the essential portions of a noise canceler according to this mode.
This noise canceler is installed in the above-described speech coder. For example, it is placed at the preceding stage
of the buffer 1301 in the speech coder shown in FIG. 13.

The noise canceler shown in FIG. 27 comprises an A/D converter 272, a noise cancellation coefficient storage
section 273, a noise cancellation coefficient adjusting section 274, an input waveform setting section 275, an LPC
analyzing section 276, a Fourier transform section 277, a noise canceling/spectrum compensating section 278, a
spectrum stabilizing section 279, an inverse Fourier transform section 280, a spectrum enhancing section 281, a
waveform matching section 282, a noise estimating section 284, a noise spectrum storage section 285, a previous
spectrum storage section 286, a random phase storage section 287, a previous waveform storage section 288, and a
maximum power storage section 289.

To begin with, initial settings will be discussed. Table 10 shows the names of fixed parameters and setting exam- 
ples. 



Table 10

Fixed Parameters                                       Setting Examples
-----------------------------------------------------  -------------------------------------
frame length                                           160 (20 msec for 8-kHz sampling data)
pre-read data length                                   80 (10 msec for the above data)
FFT order                                              256
LPC prediction order                                   10
sustaining number of noise spectrum reference          30
designated minimum power                               20.0
AR enhancement coefficient 0                           0.5
MA enhancement coefficient 0                           0.8
high-frequency enhancement coefficient 0               0.4
AR enhancement coefficient 1-0                         0.66
MA enhancement coefficient 1-0                         0.64
AR enhancement coefficient 1-1                         0.7
MA enhancement coefficient 1-1                         0.6
high-frequency enhancement coefficient 1               0.3
power enhancement coefficient                          1.2
noise reference power                                  20000.0
unvoiced segment power reduction coefficient           0.3
compensation power increase coefficient                2.0
number of consecutive noise references                 5
noise cancellation coefficient training coefficient    0.8
unvoiced segment detection coefficient                 0.05
designated noise cancellation coefficient              1.5



Phase data for adjusting the phase should have been stored in the random phase storage section 287. Those are
used to rotate the phase in the spectrum stabilizing section 279. Table 11 shows a case where there are eight kinds of
phase data.

Table 11

Phase Data
(-0.51, 0.86), (0.98, -0.17)
(0.30, 0.95), (-0.53, -0.84)
(-0.94, -0.34), (0.70, 0.71)
(-0.22, 0.97), (0.38, -0.92)

Further, a counter (random phase counter) for using the phase data should have been stored in the random phase
storage section 287 too. This value should have been initialized to 0 before storage.

Next, the static RAM area is set. Specifically, the noise cancellation coefficient storage section 273, the noise
spectrum storage section 285, the previous spectrum storage section 286, the previous waveform storage section 288
and the maximum power storage section 289 are cleared. The following will discuss the individual storage sections and
a setting example.

The noise cancellation coefficient storage section 273 is an area for storing a noise cancellation coefficient, whose
initial value is 20.0. The noise spectrum storage section 285 is an area for storing, for each frequency, mean
noise power, a mean noise spectrum, a compensation noise spectrum for the first candidate, a compensation noise
spectrum for the second candidate, and a frame number (sustaining number) indicating how many frames earlier the
spectrum value of each frequency has changed. As initial values, a sufficiently large value should be stored for the
mean noise power, the designated minimum power for the mean noise spectrum, and sufficiently large values for the
compensation noise spectra and the sustaining number.

The previous spectrum storage section 286 is an area for storing compensation noise power, power (full range,
intermediate range) of a previous frame (previous frame power), smoothing power (full range, intermediate range) of a
previous frame (previous smoothing power), and a noise sequence number. As initial values, a sufficiently large value
should be stored for the compensation noise power, 0.0 for both the previous frame power and the previous smoothing
power, and the noise reference sequence number as the noise sequence number.

The previous waveform storage section 288 is an area for storing, for matching of the output signal, the last portion
of the output signal of the previous frame by the pre-read data length, and all 0 should be stored as an initial value. The
spectrum enhancing section 281, which executes ARMA and high-frequency enhancement filtering, should have the
statuses of the respective filters cleared to 0 for that purpose. The maximum power storage section 289 is an area for
storing the maximum power of the input signal, and should have 0 stored as the maximum power.
Then, the noise cancellation algorithm will be explained block by block with reference to FIG. 27.
First, an analog input signal 271 including a speech is subjected to A/D conversion in the A/D converter 272, and
is input by one frame length + pre-read data length (160 + 80 = 240 points in the above setting example). The noise
cancellation coefficient adjusting section 274 computes a noise cancellation coefficient and a compensation coefficient
from an equation 49 based on the noise cancellation coefficient stored in the noise cancellation coefficient storage
section 273, a designated noise cancellation coefficient, a learning coefficient for the noise cancellation coefficient, and
a compensation power increase coefficient. The obtained noise cancellation coefficient is stored in the noise cancellation
coefficient storage section 273, the input signal obtained by the A/D converter 272 is sent to the input waveform setting
section 275, and the compensation coefficient and noise cancellation coefficient are sent to the noise estimating
section 284 and the noise canceling/spectrum compensating section 278.



where q: noise cancellation coefficient 

Q: designated noise cancellation coefficient 

C: learning coefficient for the noise cancellation coefficient 

r: compensation coefficient 

D: compensation power increase coefficient. 

The noise cancellation coefficient is a coefficient indicating a rate of decreasing noise, the designated noise
cancellation coefficient is a fixed coefficient previously designated, the learning coefficient for the noise cancellation
coefficient is a coefficient indicating a rate by which the noise cancellation coefficient approaches the designated noise
cancellation coefficient, the compensation coefficient is a coefficient for adjusting the compensation power in the
spectrum compensation, and the compensation power increase coefficient is a coefficient for adjusting the compensation
coefficient.
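
The per-frame adjustment of the equation 49, q = q × C + Q × (1 - C) and r = Q / q × D, can be sketched as follows. The function name is hypothetical, the default values are the setting examples of Table 10, and computing r from the already updated q is an assumption, since the order of the two updates is not stated explicitly.

```python
def adjust_noise_cancellation(q, Q=1.5, C=0.8, D=2.0):
    """One step of equation 49.

    q: stored noise cancellation coefficient (initial value 20.0)
    Q: designated noise cancellation coefficient
    C: learning coefficient for the noise cancellation coefficient
    D: compensation power increase coefficient
    Returns the updated q and the compensation coefficient r.
    """
    q_new = q * C + Q * (1.0 - C)   # q moves toward Q at a rate set by C
    r = Q / q_new * D               # assumption: r uses the updated q
    return q_new, r
```

Starting from the initial value 20.0, repeated application makes q approach the designated value 1.5 and r approach 2.0.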

In the input waveform setting section 275, the input signal from the A/D converter 272 is written, aligned to the end,
in a memory arrangement whose length is a power of 2 so that FFT (Fast Fourier Transform) can be carried out. 0 should
be filled in the front portion. In the above setting example, 0 is written in 0 to 15 in the arrangement with a length of
256, and the input signal is written in 16 to 255. This arrangement is used as a real number portion in FFT of the eighth
order. An arrangement having the same length as the real number portion is prepared for
an imaginary number portion, and all 0 should be written there.
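
The buffer layout described above can be sketched as follows; the function name and the use of Python lists are assumptions made for illustration.

```python
def set_input_waveform(samples, fft_size=256):
    """Right-align the input in a zero-filled real buffer of length fft_size.

    With 240 input points and fft_size = 256, indices 0..15 hold 0 and
    indices 16..255 hold the input signal. The imaginary buffer is all 0.
    """
    if len(samples) > fft_size:
        raise ValueError("input longer than the FFT buffer")
    pad = fft_size - len(samples)
    real = [0.0] * pad + [float(x) for x in samples]
    imag = [0.0] * fft_size
    return real, imag
```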

In the LPC analyzing section 276, a Hamming window is put on the real number area set in the input waveform
setting section 275, autocorrelation analysis is performed on the Hamming-windowed waveform to acquire
autocorrelation values, and autocorrelation-based LPC analysis is performed to acquire linear predictive coefficients.
Further, the obtained linear predictive coefficients are sent to the spectrum enhancing section 281.

The Fourier transform section 277 conducts discrete Fourier transform by FFT using the memory arrangement of 
the real number portion and the imaginary number portion, obtained by the input waveform setting section 275. The 
sum of the absolute values of the real number portion and the imaginary number portion of the obtained complex spec- 
trum is computed to acquire the pseudo amplitude spectrum (input spectrum hereinafter) of the input signal. Further, 
the total sum of the input spectrum value of each frequency (input power hereinafter) is obtained and sent to the noise 
estimating section 284. The complex spectrum itself is sent to the spectrum stabilizing section 279. 
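
The pseudo amplitude spectrum and the input power can be sketched as follows, assuming the real and imaginary parts of the FFT output are already available as two lists; the function name is hypothetical.

```python
def pseudo_amplitude(real_part, imag_part):
    """Return the input spectrum (|Re| + |Im| per frequency) and the input power.

    The sum of absolute values replaces the true magnitude sqrt(Re^2 + Im^2),
    which avoids a square root per frequency.
    """
    spectrum = [abs(re) + abs(im) for re, im in zip(real_part, imag_part)]
    return spectrum, sum(spectrum)
```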

A process in the noise estimating section 284 will now be discussed. 

The noise estimating section 284 compares the input power obtained by the Fourier transform section 277 with the
maximum power value stored in the maximum power storage section 289, and when the stored maximum power is
smaller, stores the input power value in the maximum power storage section 289 as the new maximum power value. If
at least one of the following cases is satisfied, noise estimation is performed, and if none of them is met, noise
estimation is not carried out.

(1) The input power is smaller than the maximum power multiplied by an unvoiced segment detection coefficient. 

(2) The noise cancellation coefficient is larger than the designated noise cancellation coefficient plus 0.2. 

(3) The input power is smaller than a value obtained by multiplying the mean noise power, obtained from the noise 
spectrum storage section 285, by 1.6. 
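
The decision whether to run noise estimation for a frame can be sketched as below. The function names are hypothetical; the default values of Q and the unvoiced segment detection coefficient are the setting examples of Table 10.

```python
def update_max_power(input_power, max_power):
    """Keep the larger of the stored maximum power and the input power."""
    return max(input_power, max_power)


def should_estimate_noise(input_power, max_power, q, mean_noise_power,
                          Q=1.5, detect_coef=0.05):
    """True when at least one of the cases (1) to (3) holds."""
    case1 = input_power < max_power * detect_coef   # (1) very quiet frame
    case2 = q > Q + 0.2                             # (2) q has not yet settled
    case3 = input_power < mean_noise_power * 1.6    # (3) close to the noise level
    return case1 or case2 or case3
```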

The noise estimating algorithm in the noise estimating section 284 will now be discussed. 

First, the sustaining numbers of all the frequencies for the first and second candidates stored in the noise spectrum
storage section 285 are updated (incremented by 1). Then, the sustaining number of each frequency for the first
candidate is checked, and when it is larger than a previously set sustaining number of noise spectrum reference, the
compensation spectrum and sustaining number for the second candidate are set as those for the first candidate, the
compensation spectrum of the second candidate is replaced with that of the third candidate, and the sustaining number
of the second candidate is set to 0. Note that in this replacement of the compensation spectrum of the second candidate,
memory can be saved by not storing the third candidate and substituting instead a value slightly larger than the second
candidate. In this mode, a spectrum which is 1.4 times greater than the compensation spectrum of the second
candidate is substituted.



$$
\begin{aligned}
q &= q \times C + Q \times (1 - C) \\
r &= Q / q \times D
\end{aligned}
\tag{49}
$$

After renewing the sustaining number, the compensation noise spectrum is compared with the input spectrum for
each frequency. First, the input spectrum of each frequency is compared with the compensation noise spectrum of the
first candidate, and when the input spectrum is smaller, the compensation noise spectrum and sustaining number for
the first candidate are set as those for the second candidate, and the input spectrum is set as the compensation
spectrum of the first candidate with the sustaining number set to 0. Otherwise, the input spectrum is compared with the
compensation noise spectrum of the second candidate, and when the input spectrum is smaller, the input spectrum is
set as the compensation spectrum of the second candidate with the sustaining number set to 0. Then, the obtained
compensation spectra and sustaining numbers of the first and second candidates are stored in the noise spectrum
storage section 285. At the same time, the mean noise spectrum is updated according to the following equation 50.

$$
s_i = s_i \times g + S_i \times (1 - g) \tag{50}
$$

where s: mean noise spectrum

S: input spectrum

g: 0.9 (when the input power is larger than a half the mean noise power)

0.5 (when the input power is equal to or smaller than a half the mean noise power)

i: number of the frequency.

The mean noise spectrum is a pseudo mean noise spectrum, and the coefficient g in the equation 50 is for adjusting
the speed of learning the mean noise spectrum. That is, the coefficient has such an effect that when the input power is
smaller than the noise power, the frame is likely to be a noise-only segment so that the learning speed is increased, and
otherwise, it is likely to be in a speech segment so that the learning speed is reduced.

Then, the total of the values of the individual frequencies of the mean noise spectrum is obtained to be the mean
noise power. The compensation noise spectrum, mean noise spectrum and mean noise power are stored in the noise
spectrum storage section 285.
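
The noise estimating algorithm (candidate ageing, candidate replacement and the mean update of the equation 50) can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function and argument names are hypothetical, the lists are updated in place, the third candidate is replaced by 1.4 times the second candidate as in this mode, and g is chosen from the mean noise power before its update, which is an assumption.

```python
def update_noise_estimate(inp, cand1, age1, cand2, age2, mean,
                          input_power, mean_noise_power, max_age=30):
    """Update compensation candidates and the mean noise spectrum in place.

    inp: input spectrum; cand1/cand2: first/second compensation candidates;
    age1/age2: their sustaining numbers; mean: mean noise spectrum.
    Returns the new mean noise power.
    """
    for i in range(len(inp)):
        age1[i] += 1
        age2[i] += 1
        if age1[i] > max_age:
            # Promote the second candidate; stand in for the third candidate
            # with 1.4 times the (old) second candidate.
            cand1[i], age1[i] = cand2[i], age2[i]
            cand2[i], age2[i] = cand1[i] * 1.4, 0
        if inp[i] < cand1[i]:
            cand2[i], age2[i] = cand1[i], age1[i]
            cand1[i], age1[i] = inp[i], 0
        elif inp[i] < cand2[i]:
            cand2[i], age2[i] = inp[i], 0
    # Equation 50: learn faster (g = 0.5) when the frame is well below the noise level.
    g = 0.9 if input_power > 0.5 * mean_noise_power else 0.5
    for i in range(len(inp)):
        mean[i] = mean[i] * g + inp[i] * (1.0 - g)
    return sum(mean)
```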

In the above noise estimating process, the capacity of the RAM constituting the noise spectrum storage section 285
can be saved by making a noise spectrum of one frequency correspond to the input spectra of a plurality of frequencies.
As an example, consider the RAM capacity of the noise spectrum storage section 285 when a noise spectrum of one
frequency is estimated from the input spectra of four frequencies, with the 256-point FFT used in this mode.
Because the (pseudo) amplitude spectrum is symmetrical along the frequency axis, making estimation for all the
frequencies requires storing spectra of 128 frequencies and 128 sustaining numbers, so that the required RAM capacity
is a total of 768 W, that is, 128 (frequencies) × 2 (spectrum and sustaining number) × 3 (first and second candidates for
compensation, and mean).

When a noise spectrum of one frequency is made to correspond to input spectra of four frequencies, by contrast,
the required RAM capacity is a total of 192 W, that is, 32 (frequencies) × 2 (spectrum and sustaining number) × 3 (first
and second candidates for compensation, and mean). It has been confirmed through experiments that in this 1 × 4 case
the performance is hardly deteriorated although the frequency resolution of the noise spectrum decreases. Because this
means does not estimate a noise spectrum from a spectrum of one frequency, it also has an effect of preventing the
spectrum from being erroneously estimated as a noise spectrum when a normal sound (sine wave, vowel or the like)
continues for a long period of time.

A description will now be given of a process in the noise canceling/spectrum compensating section 278.
A result of multiplying the mean noise spectrum, stored in the noise spectrum storage section 285, by the noise
cancellation coefficient obtained by the noise cancellation coefficient adjusting section 274 is subtracted from the input
spectrum (the result is called the spectrum difference hereinafter). When the RAM capacity of the noise spectrum
storage section 285 is saved as described in the explanation of the noise estimating section 284, a result of multiplying
a mean noise spectrum of a frequency corresponding to the input spectrum by the noise cancellation coefficient is
subtracted. When the spectrum difference becomes negative, compensation is carried out by setting a value obtained
by multiplying the first candidate of the compensation noise spectrum stored in the noise spectrum storage section 285
by the compensation coefficient obtained by the noise cancellation coefficient adjusting section 274. This is performed
for every frequency. Further, flag data is prepared for each frequency so that the frequencies at which the spectrum
difference has been compensated can be identified. For example, there is one area for each frequency, and 0 is set in
case of no compensation, and 1 is set when compensation has been carried out. This flag data is sent together with the
spectrum difference to the spectrum stabilizing section 279. Furthermore, the total number of the compensated
frequencies (compensation number) is acquired by checking the values of the flag data, and it is sent to the spectrum
stabilizing section 279 too.
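
The subtraction and compensation step can be sketched as follows. The function name is hypothetical, and the sketch assumes one mean noise value per input frequency (the RAM-saving 1 × 4 mapping is not shown).

```python
def cancel_noise(inp, mean_noise, cand1, q, r):
    """Spectral subtraction with compensation.

    inp: input spectrum; mean_noise: mean noise spectrum;
    cand1: first candidate of the compensation noise spectrum;
    q: noise cancellation coefficient; r: compensation coefficient.
    Returns (spectrum difference, flag data, compensation number).
    """
    diff, flags = [], []
    for S, s, c in zip(inp, mean_noise, cand1):
        d = S - s * q
        if d < 0:
            diff.append(c * r)   # negative difference: substitute compensation value
            flags.append(1)
        else:
            diff.append(d)
            flags.append(0)
    return diff, flags, sum(flags)
```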

A process in the spectrum stabilizing section 279 will be discussed below. This process serves to reduce allophone
feeling mainly of a segment which does not contain speeches.

First, the sum of the spectrum differences of the individual frequencies obtained from the noise canceling/spectrum
compensating section 278 is computed to obtain two kinds of current frame powers, one for the full range and the other
for the intermediate range. For the full range, the current frame power is obtained for all the frequencies (called the full
range; 0 to 128 in this mode). For the intermediate range, the current frame power is obtained for a perceptually
important intermediate band (called the intermediate range; 16 to 79 in this mode).
Likewise, the sum of the compensation noise spectra for the first candidate, stored in the noise spectrum storage
section 285, is acquired as current frame noise power (full range, intermediate range). When the compensation number
obtained from the noise canceling/spectrum compensating section 278 is checked and found to be sufficiently
large, and when at least one of the following three conditions is met, the current frame is determined as a noise-only
segment and a spectrum stabilizing process is performed.

(1) The input power is smaller than the maximum power multiplied by an unvoiced segment detection coefficient.

(2) The current frame power (intermediate range) is smaller than the current frame noise power (intermediate
range) multiplied by 5.0.

(3) The input power is smaller than the noise reference power.

In a case where the stabilizing process is not conducted, the consecutive noise number stored in the previous
spectrum storage section 286 is decremented by 1 when it is positive, and the current frame noise power (full range,
intermediate range) is set as the previous frame power (full range, intermediate range) and stored in the previous
spectrum storage section 286 before proceeding to the phase diffusion process.

The spectrum stabilizing process will now be discussed. The purpose of this process is to stabilize the spectrum
in an unvoiced segment (speech-less, noise-only segment) and reduce the power. There are two kinds of processes:
a process 1 is performed when the consecutive noise number is smaller than the number of consecutive noise
references, and a process 2 is performed otherwise. The two processes will be described as follows.

(Process 1)

The consecutive noise number stored in the previous spectrum storage section 286 is incremented by 1, and the
current frame noise power (full range, intermediate range) is set as the previous frame power (full range, intermediate
range) and stored in the previous spectrum storage section 286 before proceeding to the phase adjusting process.

(Process 2)

The previous frame power, the previous frame smoothing power and the unvoiced segment power reduction
coefficient stored in the previous spectrum storage section 286 are referred to and are changed according to an
equation 51.

$$
\begin{aligned}
Dd_{80} &= Dd_{80} \times 0.8 + A_{80} \times 0.2 \times P \\
D_{80} &= D_{80} \times 0.5 + Dd_{80} \times 0.5 \\
Dd_{129} &= Dd_{129} \times 0.8 + A_{129} \times 0.2 \times P \\
D_{129} &= D_{129} \times 0.5 + Dd_{129} \times 0.5
\end{aligned}
\tag{51}
$$

where Dd80: previous frame smoothing power (intermediate range)
D80: previous frame power (intermediate range)
Dd129: previous frame smoothing power (full range)
D129: previous frame power (full range)
A80: current frame noise power (intermediate range)
A129: current frame noise power (full range).

Then, those powers are reflected on the spectrum differences. For that purpose, two coefficients, one to be multiplied
in the intermediate range (coefficient 1 hereinafter) and the other to be multiplied in the full range (coefficient 2
hereinafter), are computed. First, the coefficient 1 is computed from an equation 52.

$$
r_1 =
\begin{cases}
D_{80} / A_{80} & (\text{when } A_{80} > 0) \\
1.0 & (\text{when } A_{80} \le 0)
\end{cases}
\tag{52}
$$

where r1: coefficient 1

D80: previous frame power (intermediate range)
A80: current frame noise power (intermediate range).

As the coefficient 2 is influenced by the coefficient 1, the means of acquiring it is slightly more complicated. The
procedures will be illustrated below.

(1) When the previous frame smoothing power (full range) is smaller than the previous frame power (intermediate
range) or when the current frame noise power (full range) is smaller than the current frame noise power
(intermediate range), the flow goes to (2), but goes to (3) otherwise.

(2) The coefficient 2 is set to 0.0, and the previous frame power (full range) is set as the previous frame power
(intermediate range), then the flow goes to (6).

(3) When the current frame noise power (full range) is equal to the current frame noise power (intermediate range),
the flow goes to (4), but goes to (5) otherwise.

(4) The coefficient 2 is set to 1.0, and then the flow goes to (6).

(5) The coefficient 2 is acquired from the following equation 53, and then the flow goes to (6).

$$
r_2 = (D_{129} - D_{80}) / (A_{129} - A_{80}) \tag{53}
$$

where r2: coefficient 2

D129: previous frame power (full range)

D80: previous frame power (intermediate range)

A129: current frame noise power (full range)

A80: current frame noise power (intermediate range).

(6) The computation of the coefficient 2 is terminated.

The coefficients 1 and 2 obtained in the above algorithm always have their upper limits clipped to 1.0 and lower
limits to the unvoiced segment power reduction coefficient. A value obtained by multiplying the spectrum difference of
the intermediate frequency (16 to 79 in this example) by the coefficient 1 is set as a spectrum difference, and a value
obtained by multiplying the spectrum difference of the frequency excluding the intermediate range from the full range of
that spectrum difference (0 to 15 and 80 to 128 in this example) by the coefficient 2 is set as a spectrum difference.
Accordingly, the previous frame power (full range, intermediate range) is converted by the following equation 54.

$$
\begin{aligned}
D_{80} &= A_{80} \times r_1 \\
D_{129} &= D_{80} + (A_{129} - A_{80}) \times r_2
\end{aligned}
\tag{54}
$$

where r1: coefficient 1
r2: coefficient 2
D80: previous frame power (intermediate range)
A80: current frame noise power (intermediate range)
D129: previous frame power (full range)
A129: current frame noise power (full range).

Various sorts of power data, etc. obtained in this manner are all stored in the previous spectrum storage section
286 and the process 2 is then terminated.
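
The process 2 as a whole (the equations 51 to 54 and the steps (1) to (6)) can be sketched as follows. This is an illustration under stated assumptions: the function and argument names are hypothetical, and the symbol P of the equation 51, which is not listed among the definitions, is assumed to be the unvoiced segment power reduction coefficient (0.3 in Table 10).

```python
def stabilize(diff, D80, D129, Dd80, Dd129, A80, A129, P=0.3, lo=16, hi=79):
    """Scale the spectrum differences of a noise-only frame (process 2).

    diff: spectrum differences for frequencies 0..128.
    Returns the scaled differences and the updated (D80, D129, Dd80, Dd129).
    """
    # Equation 51: smooth the previous-frame powers toward the noise power.
    Dd80 = Dd80 * 0.8 + A80 * 0.2 * P
    D80 = D80 * 0.5 + Dd80 * 0.5
    Dd129 = Dd129 * 0.8 + A129 * 0.2 * P
    D129 = D129 * 0.5 + Dd129 * 0.5

    # Equation 52: coefficient 1 for the intermediate range.
    r1 = D80 / A80 if A80 > 0 else 1.0

    # Steps (1) to (6): coefficient 2 for the rest of the full range.
    if Dd129 < D80 or A129 < A80:
        r2 = 0.0
        D129 = D80
    elif A129 == A80:
        r2 = 1.0
    else:
        r2 = (D129 - D80) / (A129 - A80)   # equation 53

    # Clip both coefficients to [P, 1.0].
    r1 = max(P, min(1.0, r1))
    r2 = max(P, min(1.0, r2))

    out = [d * (r1 if lo <= i <= hi else r2) for i, d in enumerate(diff)]

    # Equation 54: convert the previous frame powers.
    D80 = A80 * r1
    D129 = D80 + (A129 - A80) * r2
    return out, (D80, D129, Dd80, Dd129)
```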

The spectrum stabilization by the spectrum stabilizing section 279 is carried out in the above manner. 

Next, the phase adjusting process will be explained. While the phase is not changed in principle in the conventional
spectrum subtraction, a process of altering the phase at random is executed when the spectrum of that frequency is
compensated at the time of cancellation. This process enhances the randomness of the remaining noise, yielding such
an effect of making it less likely to give a perceptually adverse impression.

First, the random phase counter stored in the random phase storage section 287 is obtained. Then, the flag data
(indicating the presence/absence of compensation) of all the frequencies are referred to, and the phase of the complex
spectrum obtained by the Fourier transform section 277 is rotated using the following equation 55 when compensation
has been performed.

$$
\begin{aligned}
B_s &= S_i \times R_c - T_i \times R_{c+1} \\
B_t &= S_i \times R_{c+1} + T_i \times R_c \\
S_i &= B_s \\
T_i &= B_t
\end{aligned}
\tag{55}
$$

where Si, Ti: complex spectrum
i: index indicating the frequency
R: random phase data
c: random phase counter

Bs, Bt: register for computation.

In the equation 55, two random phase data are used in pair. Every time the process is performed once, the random
phase counter is incremented by 2, and is set to 0 when it reaches the upper limit (16 in this mode). The random phase
counter is stored in the random phase storage section 287 and the acquired complex spectrum is sent to the inverse
Fourier transform section 280. Further, the total of the spectrum differences (spectrum difference power hereinafter)
is obtained and sent to the spectrum enhancing section 281.
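
The phase rotation of the equation 55 can be sketched as follows. Several points are assumptions made for illustration: Si and Ti are taken to be the real and imaginary parts of the complex spectrum, the eight pairs of Table 11 are flattened row by row into 16 values, and the counter advances once per rotated frequency. The function name is hypothetical.

```python
# Table 11 flattened row by row (assumed ordering): 8 pairs = 16 values.
PHASE_DATA = [-0.51, 0.86, 0.98, -0.17, 0.30, 0.95, -0.53, -0.84,
              -0.94, -0.34, 0.70, 0.71, -0.22, 0.97, 0.38, -0.92]


def rotate_phases(S, T, flags, counter, R=PHASE_DATA):
    """Rotate the phase of compensated frequencies in place (equation 55).

    S, T: real and imaginary parts; flags: 1 where compensation was applied.
    Returns the updated random phase counter.
    """
    for i, flag in enumerate(flags):
        if not flag:
            continue
        Bs = S[i] * R[counter] - T[i] * R[counter + 1]
        Bt = S[i] * R[counter + 1] + T[i] * R[counter]
        S[i], T[i] = Bs, Bt
        counter += 2
        if counter >= len(R):   # upper limit reached (16 in this mode)
            counter = 0
    return counter
```

Each pair in Table 11 has approximately unit length, so the operation is a multiplication by a unit complex number and leaves the amplitude essentially unchanged.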

The inverse Fourier transform section 280 constructs a new complex spectrum based on the amplitude of the
spectrum difference and the phase of the complex spectrum, obtained by the spectrum stabilizing section 279, and carries
out inverse Fourier transform using FFT. (The yielded signal is called a first order output signal.) The obtained first order
output signal is sent to the spectrum enhancing section 281.

Next, a process in the spectrum enhancing section 281 will be discussed.

First, the mean noise power stored in the noise spectrum storage section 285, the spectrum difference power
obtained by the spectrum stabilizing section 279 and the noise reference power, which is constant, are referred to in
order to select an MA enhancement coefficient and an AR enhancement coefficient. The selection is implemented by
evaluating the following two conditions.

(Condition 1)

The spectrum difference power is greater than a value obtained by multiplying the mean noise power, stored in the
noise spectrum storage section 285, by 0.6, and the mean noise power is greater than the noise reference power.

(Condition 2)

The spectrum difference power is greater than the mean noise power.

When the condition 1 is met, this segment is a "voiced segment," the MA enhancement coefficient is set to an MA
enhancement coefficient 1-1, the AR enhancement coefficient is set to an AR enhancement coefficient 1-1, and a
high-frequency enhancement coefficient is set to a high-frequency enhancement coefficient 1. When the condition 1 is
not satisfied but the condition 2 is met, this segment is an "unvoiced segment," the MA enhancement coefficient is set
to an MA enhancement coefficient 1-0, the AR enhancement coefficient is set to an AR enhancement coefficient 1-0, and
the high-frequency enhancement coefficient is set to 0. When the condition 1 is satisfied but the condition 2 is not, this
segment is an "unvoiced, noise-only segment," the MA enhancement coefficient is set to an MA enhancement
coefficient 0, the AR enhancement coefficient is set to an AR enhancement coefficient 0, and the high-frequency
enhancement coefficient is set to a high-frequency enhancement coefficient 0.
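
The selection of the enhancement coefficients can be sketched as follows, using the setting examples of Table 10. The function name and the returned dictionary are hypothetical. Note that the third case as worded above overlaps the first one; the sketch therefore treats the "unvoiced, noise-only" setting as the remaining case in which neither of the first two branches applies, which is an assumption.

```python
def select_enhancement(diff_power, mean_noise_power, noise_ref_power=20000.0):
    """Pick MA/AR/high-frequency enhancement coefficients for the frame."""
    cond1 = (diff_power > 0.6 * mean_noise_power
             and mean_noise_power > noise_ref_power)
    cond2 = diff_power > mean_noise_power
    if cond1:
        # Coefficients 1-1 and high-frequency coefficient 1.
        return {"segment": "voiced", "ma": 0.6, "ar": 0.7, "hf": 0.3}
    if cond2:
        # Coefficients 1-0; high-frequency coefficient set to 0.
        return {"segment": "unvoiced", "ma": 0.64, "ar": 0.66, "hf": 0.0}
    # Remaining case (assumed): coefficients 0 and high-frequency coefficient 0.
    return {"segment": "unvoiced, noise-only", "ma": 0.8, "ar": 0.5, "hf": 0.4}
```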

Using the linear predictive coefficients obtained from the LPC analyzing section 276, the MA enhancement
coefficient and the AR enhancement coefficient, an MA coefficient and an AR coefficient of an extreme enhancement
filter are computed based on the following equation 56.

$$
\begin{aligned}
\alpha(ma)_i &= \alpha_i \times \beta^{i} \\
\alpha(ar)_i &= \alpha_i \times \gamma^{i}
\end{aligned}
\tag{56}
$$

where α(ma)i: MA coefficient
α(ar)i: AR coefficient
αi: linear predictive coefficient
β: MA enhancement coefficient
γ: AR enhancement coefficient
i: number.

Then, the first order output signal acquired by the inverse Fourier transform section 280 is put through the extreme
enhancement filter using the MA coefficient and AR coefficient. The transfer function of this filter is given by the
following equation 57.

$$
\frac{1 + \alpha(ma)_1 \times Z^{-1} + \alpha(ma)_2 \times Z^{-2} + \cdots + \alpha(ma)_j \times Z^{-j}}
     {1 + \alpha(ar)_1 \times Z^{-1} + \alpha(ar)_2 \times Z^{-2} + \cdots + \alpha(ar)_j \times Z^{-j}}
\tag{57}
$$

where α(ma)j: MA coefficient
α(ar)j: AR coefficient
j: order.

Further, to enhance the high frequency component, high-frequency enhancement filtering is performed by using
the high-frequency enhancement coefficient. The transfer function of this filter is given by the following equation 58.

$$
1 - \delta Z^{-1} \tag{58}
$$

where δ: high-frequency enhancement coefficient.

A signal obtained through the above process is called a second order output signal. The filter status is saved in the
spectrum enhancing section 281.
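
The filtering of the equations 56 to 58 can be sketched as a direct-form difference equation. The function name is hypothetical, the sign convention follows the equation 57 as written, and the filter state is assumed to start at zero for each call, whereas the spectrum enhancing section 281 carries the state over from frame to frame.

```python
def enhancement_filter(x, lpc, beta, gamma, delta):
    """Apply the ARMA filter (equations 56, 57) and then 1 - delta * z^-1 (58).

    x: first order output signal; lpc: coefficients a_1..a_J;
    beta / gamma: MA / AR enhancement coefficients;
    delta: high-frequency enhancement coefficient.
    """
    # Equation 56: weight the i-th coefficient by beta**i or gamma**i.
    ma = [a * beta ** (i + 1) for i, a in enumerate(lpc)]
    ar = [a * gamma ** (i + 1) for i, a in enumerate(lpc)]
    order = len(lpc)

    # Equation 57: y[n] = x[n] + sum ma_j x[n-j] - sum ar_j y[n-j].
    y = []
    for n in range(len(x)):
        acc = x[n]
        for j in range(1, order + 1):
            if n - j >= 0:
                acc += ma[j - 1] * x[n - j] - ar[j - 1] * y[n - j]
        y.append(acc)

    # Equation 58: first-order high-frequency emphasis.
    return [y[n] - delta * (y[n - 1] if n > 0 else 0.0) for n in range(len(y))]
```

When beta equals gamma the numerator and denominator cancel, so the ARMA stage passes the signal unchanged; the enhancement comes from choosing different values for the two coefficients.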

Finally, the waveform matching section 282 makes the second order output signal, obtained by the spectrum
enhancing section 281, and the signal stored in the previous waveform storage section 288, overlap one on the other
with a triangular window. Further, the last portion of this output signal, by the pre-read data length, is stored in the
previous waveform storage section 288. A matching scheme at this time is shown by the following equation 59.

$$
\begin{aligned}
O_j &= (j \times D_j + (L - j) \times Z_j) / L && (j = 0 \sim L-1) \\
O_j &= D_j && (j = L \sim L+M-1) \\
Z_j &= O_{M+j} && (j = 0 \sim L-1)
\end{aligned}
\tag{59}
$$

where Oj: output signal
Dj: second order output signal
Zj: output signal (stored in the previous waveform storage section 288)
L: pre-read data length
M: frame length.



It is to be noted that while data of the pre-read data length + frame length is output as the output signal, the part of
the output signal which can be handled as a signal is only a segment of the frame length from the beginning of the data.
This is because the later data of the pre-read data length will be rewritten when the next output signal is output. Because
continuity is compensated in the entire segments of the output signal, however, the data can be used in frequency
analysis, such as LPC analysis or filter analysis.
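
The waveform matching with a triangular window can be sketched as follows. The function name is hypothetical; the sketch assumes that the second order output signal D has length L + M, that the first L points are cross-faded with the stored signal Z, and that the last L points of the result are kept for the next frame.

```python
def match_waveform(D, Z):
    """Cross-fade the stored tail Z into the head of D (triangular window).

    D: second order output signal of length L + M.
    Z: stored signal of length L from the previous frame.
    Returns the output signal O and the new signal to store.
    """
    L = len(Z)
    O = list(D)
    for j in range(L):
        # Weight moves linearly from the stored signal (j = 0) to the new one.
        O[j] = (j * D[j] + (L - j) * Z[j]) / L
    new_Z = O[len(D) - L:]   # last L points, i.e. O[M + j] for j = 0..L-1
    return O, new_Z
```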

45 According to this mode, noise spectrum estimation can be conducted for a segment outside a voiced segment as 
well as in a voiced segment, so that a noise spectrum can be estimated even when it is not clear at which timing a 
speech is present in data. 

It is possible to enhance the characteristic of the input spectrum envelope with the linear predictive coefficients, and
it is possible to prevent degradation of the sound quality even when the noise level is high.
Further, using the mean spectrum of noise can cancel the noise spectrum more significantly. Further, separate
estimation of the compensation spectrum can ensure more accurate compensation.

It is possible to smooth a spectrum in a noise-only segment where no speech is contained, and the spectrum in this 
segment can prevent allophone feeling from being caused by an extreme spectrum variation which is originated from 
noise cancellation. 

The phase of the compensated frequency component can be given a random property, so that noise remaining
uncanceled can be converted to noise which gives less perceptual allophone feeling.

The proper weighting can perceptually be given in a voiced segment, and allophone feeling originating from perceptual
weighting can be suppressed in an unvoiced segment or an unvoiced syllable segment.

Industrial Applicability

As apparent from the above, an excitation vector generator, a speech coder and speech decoder according to this 
invention are effective in searching for excitation vectors and are suitable for improving the speech quality. 


Claims 

1. An excitation vector generator comprising:

seed storage means for storing a plurality of seeds;

an oscillator for outputting different vector streams in accordance with values of seeds; and 
switch means for switching a seed to be supplied to said oscillator from said seed storage means. 

2. The excitation vector generator according to claim 1, wherein said oscillator is a non-linear oscillator. 


3. The excitation vector generator according to claim 2, wherein said non-linear oscillator is a non-linear digital filter. 

4. The excitation vector generator according to claim 3, wherein said non-linear digital filter includes an adder having
a non-linear adder characteristic, a plurality of filter state holding sections to which an output of said adder is
sequentially transferred as a filter state, and a plurality of multipliers for multiplying a filter state, output from each
of said filter state holding sections, by a gain and sending a multiplication value to said adder;

seeds read from said seed storage means are supplied to said filter state holding sections as initial values of
said filter states;

said adder has an externally supplied vector stream and said multiplication values output from said multipliers
as input values and produces an adder output according to said non-linear adder characteristic with respect to
a sum of said input values; and

said gains of said multipliers are fixed in such a way that poles of said digital filter lie outside a unit circle on a
Z plane.


5. The excitation vector generator according to claim 4, wherein said non-linear digital filter has a second-order
all-pole model where said filter state holding sections are arranged in two stages and said multipliers are connected
in parallel to outputs of said filter state holding sections; and

said non-linear adder characteristic of said adder is a 2's complement characteristic.
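The oscillator of claims 4 and 5 can be illustrated with a short sketch: a second-order all-pole recursion whose adder wraps around like 2's complement arithmetic, with the seed loaded as the initial filter state. The 16-bit word length, the gain values (1.5, -1.1) and the names `wrap16` and `oscillate` are assumptions chosen for illustration. With these gains the poles have magnitude sqrt(1.1), so they lie outside the unit circle.

```python
def wrap16(value):
    """2's complement wraparound to the signed 16-bit range (assumed word length)."""
    return ((int(value) + 32768) % 65536) - 32768


def oscillate(seed, length, gains=(1.5, -1.1), inputs=None):
    """Generate a vector stream from a seed.

    seed   : pair of integers used as the initial values of the two filter states
    gains  : multiplier gains; z^2 - 1.5 z + 1.1 = 0 gives |z| = sqrt(1.1) > 1
    inputs : optional externally supplied vector stream (zeros if omitted)
    """
    state1, state2 = seed
    stream = []
    for n in range(length):
        external = inputs[n] if inputs is not None else 0
        total = external + gains[0] * state1 + gains[1] * state2
        sample = wrap16(total)          # non-linear adder characteristic
        stream.append(sample)
        state2, state1 = state1, sample  # shift the filter states
    return stream


vector_a = oscillate((1000, 2000), 8)
vector_b = oscillate((1001, 2000), 8)
```

Because the poles are unstable, the linear recursion would diverge; the wraparound keeps the samples bounded, and different seeds give different streams, so only the seeds need to be stored rather than the vectors themselves.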

6. An excitation vector generator comprising: 

excitation vector storage means for storing old excitation vectors;

excitation vector processing means for performing different processes on one or a plurality of old excitation
vectors, read from said excitation vector storage means, in accordance with externally supplied indices,
thereby generating new random excitation vectors; and

switch means for switching indices to be supplied to said excitation vector processing means. 

7. The excitation vector generator according to claim 6, wherein said excitation vector processing means includes:

means for determining process contents to be applied to old excitation vectors in accordance with said indices;
and

a plurality of processing sections for sequentially performing processes according to said determined process
contents on old excitation vectors read from said excitation vector storage means.

8. The excitation vector generator according to claim 7, wherein said plurality of processing sections include sections
selected from the group consisting of a reading section for performing a process of reading element vectors of
different lengths from different positions in said excitation vector storage means, a reversing section for performing a
process of sorting a plurality of vectors after said reading process in a reverse order, a multiplying section for
performing a process of multiplying said plurality of vectors after said reversing process by different gains, a decimating
section for performing a process of shortening vector lengths of said predetermined vectors after said multiplying
process, an interpolating section for performing a process of lengthening vector lengths of said plurality of vectors
after said decimating process and an adding section for performing a process of adding said plurality of vectors
after said interpolating process.
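The chain of processing sections in claim 8 can be sketched as a small pipeline. The concrete choices below (2:1 decimation, linear interpolation to a common length, reversing the sample order within each vector) and all function names are illustrative assumptions; the claim itself fixes only the kinds of process and their order.

```python
def stretch_linear(vector, length):
    """Lengthen (or resample) a vector to `length` samples by linear interpolation."""
    if length == 1 or len(vector) == 1:
        return [vector[0]] * length
    step = (len(vector) - 1) / (length - 1)
    result = []
    for i in range(length):
        position = i * step
        k = min(int(position), len(vector) - 2)
        frac = position - k
        result.append(vector[k] * (1 - frac) + vector[k + 1] * frac)
    return result


def generate_excitation(old_excitation, read_specs, gains, out_len):
    """Build a new excitation vector from stored old excitation samples.

    read_specs : list of (position, length) pairs, one per element vector
    gains      : one gain per element vector
    """
    vectors = [old_excitation[p:p + n] for p, n in read_specs]             # reading
    vectors = [v[::-1] for v in vectors]                                    # reversing
    vectors = [[g * x for x in v] for g, v in zip(gains, vectors)]          # multiplying
    vectors = [v[::2] for v in vectors]                                     # decimating
    vectors = [stretch_linear(v, out_len) for v in vectors]                 # interpolating
    return [sum(column) for column in zip(*vectors)]                        # adding


old = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
new_vector = generate_excitation(old, [(0, 4), (2, 6)], [1.0, 0.5], out_len=5)
```

In the claimed generator an externally supplied index would select `read_specs`, `gains` and which sections are applied, so that switching the index yields a different random excitation vector from the same stored data.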

9. An excitation vector generator comprising:

fixed waveform storage means for storing a plurality of fixed waveforms;

fixed waveform arranging means for arranging said fixed waveforms read from said fixed waveform storage
means, at respective arbitrary start positions; and

adding means for adding said fixed waveforms arranged by said fixed waveform arranging means to generate
an excitation vector.

10. The excitation vector generator according to claim 9, wherein said fixed waveform arranging means has: 

a table where information of a plurality of start position candidates for start positions of each of said fixed
waveforms is registered for each fixed waveform;

means for selecting said start position of said each fixed waveform from a plurality of start position candidates 
in said table based on combination information on said start positions of said fixed waveforms; and 
means for arranging said each fixed waveform at said selected start position. 

11. The excitation vector generator according to claim 9, wherein said fixed waveform arranging means algebraically
produces said start position candidate information of said each fixed waveform.
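Claims 9 to 11 can be illustrated as follows: start position candidates are produced by a simple algebraic rule, a combination code is decoded into one start position per waveform, and the shifted waveforms are summed. The interleaved candidate rule, the mixed-radix code layout, and all names here are assumptions made for this sketch only.

```python
def candidate_starts(channel, n_channels, frame_len):
    """Algebraically generated candidates: channel, channel + n_channels, ... (assumed rule)."""
    return list(range(channel, frame_len, n_channels))


def decode_combination(code, tables):
    """Split one combination code into a start position per waveform (mixed radix)."""
    starts = []
    for table in tables:
        code, index = divmod(code, len(table))
        starts.append(table[index])
    return starts


def build_excitation(waveforms, starts, frame_len):
    """Place each fixed waveform at its start position and add them up."""
    vector = [0.0] * frame_len
    for waveform, start in zip(waveforms, starts):
        for k, sample in enumerate(waveform):
            if start + k < frame_len:  # samples past the frame end are dropped
                vector[start + k] += sample
    return vector


tables = [candidate_starts(ch, 3, 10) for ch in range(3)]
starts = decode_combination(7, tables)
excitation = build_excitation([[1.0, -0.5], [0.5, 0.5, 0.5], [1.0]], starts, 10)
```

Generating the candidates by rule means no position table has to be stored, and only the single combination code needs to be transmitted to reproduce the excitation vector.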

12. A speech coder comprising: 

seed storage means for storing a plurality of seeds;

an oscillator for outputting a vector stream in accordance with a value of a seed;

a synthesis filter for performing LPC synthesis on said vector stream output from said oscillator as an excitation
vector to thereby produce a synthesized speech; and

searching means for measuring distortion of a synthesized speech produced in association with each seed
and specifying a seed number to maximize a measured value while switching a seed to be supplied to said
oscillator from said seed storage means.

13. The speech coder according to claim 12, wherein said oscillator is a non-linear digital filter. 

14. The speech coder according to claim 13, wherein said non-linear digital filter includes an adder having a non-linear
adder characteristic, a plurality of filter state holding sections to which an output of said adder is sequentially
transferred as a filter state, and a plurality of multipliers for multiplying a filter state, output from each of said filter state
holding sections, by a gain and sending a multiplication value to said adder;

seeds read from said seed storage means are supplied to said filter state holding sections as initial values of
said filter states;

said adder has an externally supplied vector stream and said multiplication values output from said multipliers
as input values and produces an adder output according to said non-linear adder characteristic with respect to
a sum of said input values; and

said gains of said multipliers are fixed in such a way that poles of said digital filter lie outside a unit circle on a
Z plane.

15. The speech coder according to claim 12, further comprising: 

a buffer for storing an input speech signal to be coded;

an LPC analyzing means for performing linear predictive analysis on a processing frame in said buffer to
acquire linear predictive coefficients (LPCs) and converting said acquired linear predictive coefficients to a line
spectrum pair (LSP);

LSP adding means for additionally generating a plurality of line spectrum pairs in addition to said line spectrum
pair associated with said processing frame, generated by said LPC analyzing means;

quantizing/decoding means for performing quantization/decoding on all of said line spectrum pairs generated
by said LPC analyzing means and said LSP adding means, thereby generating decoded LSPs for all of said
line spectrum pairs;

means for selecting a decoded LSP to minimize an allophone from said plurality of decoded LSPs; and

means for coding said selected, decoded LSP.

16. The speech coder according to claim 15, wherein said LPC analyzing means performs linear predictive analysis on
a pre-read area in said buffer to acquire linear predictive coefficients for said pre-read area and generates a line
spectrum pair for said pre-read area from said acquired linear predictive coefficients; and

said LSP adding means performs linear interpolation on said line spectrum pair of said processing frame and
said line spectrum pair for said pre-read area to add a plurality of line spectrum pairs to be quantized.
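The linear interpolation performed by the LSP adding means in claim 16 can be sketched as a weighted mix of two LSP vectors. The interpolation weights and the name `interpolate_lsps` are assumptions; the claim does not specify how many interpolated pairs are generated or with which ratios.

```python
def interpolate_lsps(frame_lsp, preread_lsp, weights=(0.75, 0.5, 0.25)):
    """Generate additional LSP vectors between the processing-frame and pre-read LSPs.

    Each weight w yields w * frame_lsp + (1 - w) * preread_lsp, element by element.
    """
    if len(frame_lsp) != len(preread_lsp):
        raise ValueError("LSP vectors must have the same order")
    return [
        [w * a + (1.0 - w) * b for a, b in zip(frame_lsp, preread_lsp)]
        for w in weights
    ]


candidates = interpolate_lsps([0.25, 0.5], [0.5, 1.0])
```

Each interpolated vector becomes one more quantization target, so the coder can later pick whichever decoded LSP gives the least audible degradation.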

17. The speech coder according to claim 16, wherein said quantizing/decoding means includes:

a quantization table for converting a line spectrum pair to a code vector by performing vector quantization on
said line spectrum pair;

LSP quantizing means for reading a code vector corresponding to a line spectrum pair to be quantized from
said quantization table to generate a vector quantized LSP;

LSP decoding means for decoding said vector quantized LSP generated by said LSP quantizing means to
generate a decoded LSP;

multiplying means for multiplying a code vector read from said quantization table with a gain; and

means for adaptively adjusting said gain of said multiplying means based on a level of a gain of said multiplying
means used for a previous frame and a size of an LSP quantization error in said LSP quantizing means.

18. A speech coder comprising: 

excitation vector storage means for storing old excitation vectors; 

excitation vector processing means for performing different processes on one or a plurality of old excitation
vectors, read from said excitation vector storage means, in accordance with indices, thereby generating new
random excitation vectors;

a synthesis filter for performing LPC synthesis on said excitation vectors output from said excitation vector 
processing means to thereby produce a synthesized speech; and 

searching means for measuring distortion of a synthesized speech produced in association with each index 
and specifying an index number to maximize a measured value while switching indices to be supplied to said 
excitation vector processing means. 

19. The speech coder according to claim 18, wherein said excitation vector processing means includes:

means for determining process contents to be applied to old excitation vectors in accordance with said indices; 
and 

a plurality of processing sections for sequentially performing processes according to said determined process 
contents on old excitation vectors read from said excitation vector storage means. 

20. A CELP type speech coder comprising: 

an adaptive codebook for storing immediately previous excitation vector information as an adaptive code
vector;

a random codebook for generating a random code vector; and

a synthesis filter for performing LPC synthesis of said adaptive code vector and said random code vector,
said random codebook being constituted of an excitation vector generator comprising seed storage means for
storing a plurality of seeds, an oscillator for outputting different vector streams in accordance with values of
seeds, and switch means for switching a seed to be supplied to said oscillator from said seed storage means.

21. A speech coder comprising: 

an excitation vector generator having fixed waveform storage means for storing a plurality of fixed waveforms,
fixed waveform arranging means for arranging said fixed waveforms read from said fixed waveform storage
means, at respective arbitrary start positions, and adding means for adding said fixed waveforms arranged by
said fixed waveform arranging means to generate an excitation vector;

a synthesis filter for synthesizing excitation vectors output from said adding means to produce a synthesized
speech; and

searching means for measuring distortion of a synthesized speech produced in association with each
combination of said start positions to specify a combination of said start positions to maximize a measured value
while instructing a combination of said start positions to said fixed waveform arranging means.

22. The speech coder according to claim 21, wherein a code number associated with said combination of said start
positions specified by said searching means is transferred as speech information.

23. The speech coder according to claim 21, wherein said fixed waveform arranging means algebraically produces said
start position candidate information of said each fixed waveform.

24. A CELP type speech coder comprising: 

an adaptive codebook for storing immediately previous excitation vector information as an adaptive code vector;

a random codebook for generating a random code vector; and

a synthesis filter for performing LPC synthesis of said adaptive code vector and said random code vector,
said random codebook being constituted of an excitation vector generator comprising fixed waveform storage
means for storing a plurality of fixed waveforms, fixed waveform arranging means for arranging said fixed
waveforms read from said fixed waveform storage means, at respective arbitrary start positions, and adding means
for adding said fixed waveforms arranged by said fixed waveform arranging means to generate an excitation
vector.

25. The CELP type speech coder according to claim 24, wherein said fixed waveform storage means stores a fixed
waveform reflecting a result of analyzing a statistical characteristic of a target signal to be used in an excitation vector
search of said random codebook.

26. The CELP type speech coder according to claim 25, wherein said fixed waveform storage means stores a fixed 
waveform obtained through training with an evaluation equation for use in searching said random codebook being 
a cost function. 

27. The CELP type speech coder according to claim 24, further comprising: 

a second random codebook for generating random code vectors; and 

selecting means for selecting one of said random codebook and said second random codebook. 

28. The CELP type speech coder according to claim 27, wherein said second random codebook is vector storage 
means having a plurality of random number streams stored. 

29. The CELP type speech coder according to claim 27, wherein said second random codebook is pulse stream stor- 
age means having a plurality of pulse streams stored. 

30. The CELP type speech coder according to claim 27, wherein said second random codebook has a same structure 
as said excitation vector generator, and a number of fixed waveforms to be stored in said fixed waveform storage 
means differs from said random codebook. 

31. The CELP type speech coder according to claim 27, wherein said selecting means selects the random codebook
from which an excitation vector to minimize coding distortion has been detected as a result of performing excitation
vector search on said random codebook.

32. The CELP type speech coder according to claim 27, wherein said selecting means adaptively selects one of said 
random codebooks from a result of analyzing speech segments. 

33. The CELP type speech coder according to claim 32, wherein said result of analyzing speech segments is a judging 
parameter determined before searching said random codebooks. 

34. The CELP type speech coder according to claim 33, wherein said selecting means has a pitch gain quantizer for
quantizing a pitch gain of an adaptive code vector to generate a quantized pitch gain, and selects a random
codebook in accordance with a level of said quantized pitch gain with said quantized pitch gain taken as a judging
parameter.

35. The CELP type speech coder according to claim 33, wherein said selecting means has a pitch period calculator for
calculating a pitch period of an adaptive code vector, and selects a random codebook in accordance with said pitch
period with said pitch period taken as a transfer parameter.

36. A CELP type speech coder comprising:

fixed waveform storage means for storing a plurality of fixed waveforms;

fixed waveform arranging means having start position candidate information for each of said fixed waveforms
stored in said fixed waveform storage means;

impulse generating means for generating an impulse for said start position candidate information of said fixed
waveform arranging means;

impulse response calculating means for each waveform for generating an impulse response for each waveform
by convoluting an impulse response of a synthesis filter for producing a synthesized speech from excitation
vectors and each fixed waveform stored in said fixed waveform storage means; and

correlation matrix calculating means for calculating autocorrelations and relative correlations of said impulse
response for each waveform and mapping said autocorrelations and said relative correlations in a correlation
matrix.
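The impulse response calculation and correlation matrix of claim 36 can be sketched in a simplified form. Here each fixed waveform is convolved with the synthesis filter impulse response, and only zero-lag correlations between the resulting responses are stored; a full search would also index the correlations by start position. The function names and this simplification are assumptions.

```python
def convolve_truncated(impulse_response, waveform, length):
    """Convolve the synthesis filter impulse response with one fixed waveform."""
    result = [0.0] * length
    for i, h in enumerate(impulse_response):
        for k, w in enumerate(waveform):
            if i + k < length:
                result[i + k] += h * w
    return result


def correlation_matrix(responses):
    """Diagonal: autocorrelations; off-diagonal: cross (relative) correlations."""
    size = len(responses)
    return [
        [sum(a * b for a, b in zip(responses[i], responses[j])) for j in range(size)]
        for i in range(size)
    ]


h = [1.0, 0.5]                      # hypothetical synthesis filter impulse response
waveforms = [[1.0, 1.0], [1.0]]     # hypothetical fixed waveforms
responses = [convolve_truncated(h, w, 3) for w in waveforms]
matrix = correlation_matrix(responses)
```

Precomputing these correlations once per subframe lets the search evaluate each start position combination by table look-ups instead of running the synthesis filter for every candidate.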

37. A speech coder comprising: 

seed storage means for storing a plurality of seeds; 

an oscillator for outputting a vector stream in accordance with a value of seed; 

a synthesis filter for performing LPC synthesis on said vector stream output from said oscillator as an excitation 
vector to thereby produce a synthesized speech; 

means for measuring distortion of a synthesized speech produced in association with each seed and specify- 
ing a seed number to maximize a measured value while switching a seed to be supplied to said oscillator from 
said seed storage means; 

means for acquiring an optimal gain of a synthesized speech produced in association with said specified seed 
number; and 

vector quantizing means for performing vector quantization of said optimal gain. 

38. The speech coder according to claim 37, wherein said vector quantizing means comprises: 

parameter converting means for converting two gain information of a CELP type with said optimal gain being
a code vector of one of said gain information, an adaptive code vector gain and a random code vector gain to
a sum thereof and a ratio to said sum to thereby acquire a target vector for quantization;

decoded vector storage means for storing a decoded code vector;

predictive coefficients storage means for storing predictive coefficients;

target extracting means for acquiring a target vector by using said target vector for quantization, said decoded
code vector, and said predictive coefficients;

a vector codebook for storing a plurality of code vectors;

distance calculating means for calculating distances between said plurality of code vectors and said target vec- 
tor by using said predictive coefficients; and 

comparing means for comparing said distances with one another to acquire an optimal code vector and a cor- 
responding number by controlling said vector codebook and said distance calculating means, outputting said 
number as a code, and updating said decoded code vector using said optimal code vector. 

39. The speech coder according to claim 38, wherein said predictive coefficients are set in accordance with a degree 
of correlation between a sum and a ratio to said sum. 
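The parameter conversion of claim 38 can be illustrated by mapping the two gains to their sum and to a ratio relative to that sum. Which gain forms the numerator of the ratio is an assumption here (the adaptive code vector gain), as are the function names; any further transformation of the sum is not shown.

```python
def to_sum_and_ratio(adaptive_gain, random_gain):
    """Convert (adaptive gain, random gain) to (sum, ratio of adaptive gain to sum)."""
    total = adaptive_gain + random_gain
    if total == 0:
        raise ValueError("gain sum must be non-zero to form a ratio")
    return total, adaptive_gain / total


def from_sum_and_ratio(total, ratio):
    """Inverse conversion, as a decoder would apply it."""
    return total * ratio, total * (1.0 - ratio)


target = to_sum_and_ratio(0.75, 0.25)
gains = from_sum_and_ratio(*target)
```

The sum tracks overall excitation power and the ratio tracks voicing, so the two components can be predicted from previously decoded vectors with separate predictive coefficients.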

40. A speech coder comprising: 

an excitation vector generator having fixed waveform storage means for storing a plurality of fixed waveforms,
fixed waveform arranging means for arranging said fixed waveforms read from said fixed waveform storage
means, at respective arbitrary start positions, and adding means for adding said fixed waveforms arranged by
said fixed waveform arranging means to generate an excitation vector;

a synthesis filter for synthesizing excitation vectors output from said adding means to produce a synthesized
speech;

means for measuring distortion of a synthesized speech produced in association with each combination of said
start positions to specify a combination of said start positions to maximize a measured value while instructing
a combination of said start positions to said fixed waveform arranging means;

means for acquiring an optimal gain of a synthesized speech produced in association with said specified com- 
bination of said start positions; and 

vector quantizing means for performing vector quantization of said optimal gain. 

41. The speech coder according to claim 40, wherein said vector quantizing means comprises:

parameter converting means for converting two gain information of a CELP type with said optimal gain being
a code vector of one of said gain information, an adaptive code vector gain and a random code vector gain to
a sum thereof and a ratio to said sum to thereby acquire a target vector for quantization;

decoded vector storage means for storing a decoded code vector;

predictive coefficients storage means for storing predictive coefficients;

target extracting means for acquiring a target vector by using said target vector for quantization, said decoded 

code vector, and said predictive coefficients; 

a vector codebook for storing a plurality of code vectors; 

distance calculating means for calculating distances between said plurality of code vectors and said target vec- 
tor by using said predictive coefficients; and 

comparing means for comparing said distances with one another to acquire an optimal code vector and a cor- 
responding number by controlling said vector codebook and said distance calculating means, outputting said 
number as a code, and updating said decoded code vector using said optimal code vector. 

42. The speech coder according to claim 41, wherein said predictive coefficients are set in accordance with a degree 
of correlation between a sum and a ratio to said sum. 

43. A speech coder comprising: 

seed storage means for storing a plurality of seeds; 

a synthesis filter for performing LPC synthesis on said vector stream output from said oscillator as an excitation 
vector to thereby produce a synthesized speech; 

means for measuring distortion of a synthesized speech produced in association with each seed and specify- 
ing a seed number to maximize a measured value while switching a seed to be supplied to said oscillator from 
said seed storage means; and 

a noise canceler for removing a noise component from an input speech signal.

44. The speech coder according to claim 43, wherein said noise canceler comprises: 

A/D converting means for converting said input speech signal to a digital signal; 

noise cancellation coefficient adjusting means for adjusting a noise cancellation coefficient for determining an 
amount of noise cancellation; 

LPC analyzing means for performing linear predictive analysis on a digital signal of a given time length, 
obtained by said A/D converting means; 

Fourier transform means for performing discrete Fourier transform on said digital signal of a given time length, 
obtained by said A/D converting means to acquire an input spectrum and a complex spectrum; 
noise spectrum storage means for storing an estimated noise spectrum; 

noise estimating means for estimating a spectrum of noise by comparing said input spectrum obtained by said 
Fourier transform means with a noise spectrum stored in said noise spectrum storage means, and storing an 
acquired noise spectrum in said noise spectrum storage means; 

noise canceling/spectrum compensating means for subtracting said noise spectrum stored in said noise
spectrum storage means from said input spectrum obtained by said Fourier transform means based on a coefficient
acquired by said noise cancellation coefficient adjusting means, checking an obtained spectrum and
compensating for a spectrum of an over-reduced frequency;

spectrum stabilizing means for stabilizing said spectrum obtained by said noise canceling/spectrum
compensating means and adjusting, of phases of said complex spectrum obtained by said Fourier transform means, a
phase of said frequency compensated by said noise canceling/spectrum compensating means;

inverse Fourier transform means for performing inverse Fourier transform based on said spectrum stabilized
by said spectrum stabilizing means and said phase spectrum adjusted by said spectrum stabilizing means;

spectrum enhancing means for performing spectrum enhancement on a signal obtained by said inverse
Fourier transform means; and

waveform matching means for matching a signal obtained by said spectrum enhancing means with a signal of
a previous frame.

45. The speech coder according to claim 44, wherein said noise estimating means comprises:

means for determining if it is a noise segment in advance;

means for comparing said input spectrum obtained by said Fourier transform means with a noise spectrum for
compensation for each frequency when it is determined as noise;

means for, when said input spectrum is smaller than said noise spectrum for compensation, setting said noise
spectrum for compensation of an associated frequency as an input spectrum to thereby estimate a noise
spectrum for compensation;

means for, when said input spectrum is smaller than said noise spectrum for compensation, setting said noise
spectrum for compensation of an associated frequency as said input spectrum and adding said input spectrum
at a given ratio to thereby estimate a mean noise spectrum; and

means for storing said noise spectrum for compensation and said mean noise spectrum in said noise spectrum
storage means.

46. The speech coder according to claim 44, wherein said noise canceling/spectrum compensating means multiplies
said noise cancellation coefficient obtained by said noise cancellation coefficient adjusting means by said mean
noise spectrum stored in said noise spectrum storage means, subtracts a result from said input spectrum obtained
by said Fourier transform means, and compensates a frequency whose spectrum value has become negative with
said noise spectrum for compensation stored in said noise spectrum storage means.
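The subtraction and compensation described in claim 46 can be sketched per frequency bin. The function name `cancel_noise` and the returned compensation flags are illustrative assumptions; the flags are included only because a later stage needs to know which bins were compensated.

```python
def cancel_noise(input_spectrum, mean_noise, compensation_noise, coefficient):
    """Spectral subtraction with compensation of over-reduced frequencies.

    For each bin: subtract coefficient * mean noise; if the result is negative,
    replace it with the stored noise spectrum for compensation.
    Returns (spectrum, flags), where flags[i] is True for compensated bins.
    """
    spectrum = []
    flags = []
    for value, noise, fallback in zip(input_spectrum, mean_noise, compensation_noise):
        reduced = value - coefficient * noise
        if reduced < 0:
            spectrum.append(fallback)
            flags.append(True)
        else:
            spectrum.append(reduced)
            flags.append(False)
    return spectrum, flags


cleaned, compensated = cancel_noise([10.0, 2.0], [2.0, 2.0], [0.5, 0.5], 1.5)
```

A larger coefficient removes more noise but drives more bins negative, which is why the coefficient is adjusted separately and a dedicated compensation spectrum is kept.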

47. The speech coder according to claim 44, wherein said spectrum stabilizing means checks full range power of a
spectrum subjected to noise cancellation and spectrum compensation by said noise canceling/spectrum
compensating means and power of a perceptually important partial band to discriminate if an input signal is an unvoiced
segment, and performs stabilization and power reduction on said full range power and intermediate power when
having determined that said input signal is an unvoiced segment.

48. The speech coder according to claim 44, wherein said spectrum stabilizing means performs random-based phase
rotation on said complex spectrum obtained by said Fourier transform means based on information indicating
whether or not said complex spectrum has been subjected to spectrum compensation by said noise
canceling/spectrum compensating means.
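The random-based phase rotation of claim 48 can be sketched as multiplying each compensated bin of the complex spectrum by a unit-magnitude factor with a random angle. The uniform angle distribution, the function name, and the explicit `rng` argument are assumptions made for this illustration.

```python
import cmath
import math
import random


def rotate_compensated_phases(complex_spectrum, compensated, rng):
    """Rotate the phase of bins that were spectrum-compensated; leave the others as is.

    compensated : list of booleans, True where the bin was compensated
    rng         : a random.Random instance (passed in so results are reproducible)
    """
    rotated = []
    for value, was_compensated in zip(complex_spectrum, compensated):
        if was_compensated:
            angle = rng.uniform(0.0, 2.0 * math.pi)
            value = value * cmath.exp(1j * angle)  # magnitude is preserved
        rotated.append(value)
    return rotated


spectrum = [complex(3.0, 4.0), complex(1.0, 0.0)]
result = rotate_compensated_phases(spectrum, [True, False], random.Random(0))
```

Only the phase changes, so the amplitude spectrum set by the cancellation stage is kept while the residual noise loses its original phase structure.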

49. The speech coder according to claim 44, wherein said spectrum enhancing means has plural sets of weighting
coefficients for use in spectrum enhancement prepared in advance, selects a set of weighting coefficients in
accordance with a status of an input signal, and performs spectrum enhancement using said selected weighting
coefficients.

50. A speech coder comprising:

an excitation vector generator having fixed waveform storage means for storing a plurality of fixed waveforms,
fixed waveform arranging means for arranging said fixed waveforms read from said fixed waveform storage
means, at respective arbitrary start positions, and adding means for adding said fixed waveforms arranged by
said fixed waveform arranging means to generate an excitation vector;

a synthesis filter for synthesizing excitation vectors output from said adding means to produce a synthesized
speech;

means for measuring distortion of a synthesized speech produced in association with each combination of said
start positions to specify a combination of said start positions to maximize a measured value while instructing
a combination of said start positions to said fixed waveform arranging means; and

a noise canceler for removing a noise component from an input speech signal.

51. The speech coder according to claim 50, wherein said noise canceler comprises:

A/D converting means for converting said input speech signal to a digital signal;

noise cancellation coefficient adjusting means for adjusting a noise cancellation coefficient for determining an
amount of noise cancellation;

LPC analyzing means for performing linear predictive analysis on a digital signal of a given time length,
obtained by said A/D converting means;

Fourier transform means for performing discrete Fourier transform on said digital signal of a given time length, 
obtained by said A/D converting means to acquire an input spectrum and a complex spectrum; 
noise spectrum storage means for storing an estimated noise spectrum; 

noise estimating means for estimating a spectrum of noise by comparing said input spectrum obtained by said 
Fourier transform means with a noise spectrum stored in said noise spectrum storage means, and storing an 
acquired noise spectrum in said noise spectrum storage means; 

noise canceling/spectrum compensating means for subtracting said noise spectrum stored in said noise
spectrum storage means from said input spectrum obtained by said Fourier transform means based on a coefficient
acquired by said noise cancellation coefficient adjusting means, checking an obtained spectrum and
compensating for a spectrum of an over-reduced frequency;

spectrum stabilizing means for stabilizing said spectrum obtained by said noise canceling/spectrum
compensating means and adjusting, of phases of said complex spectrum obtained by said Fourier transform means, a
phase of said frequency compensated by said noise canceling/spectrum compensating means;

inverse Fourier transform means for performing inverse Fourier transform based on said spectrum stabilized
by said spectrum stabilizing means and said phase spectrum adjusted by said spectrum stabilizing means;

spectrum enhancing means for performing spectrum enhancement on a signal obtained by said inverse
Fourier transform means; and

waveform matching means for matching a signal obtained by said spectrum enhancing means with a signal of 
a previous frame. 

52. The speech coder according to claim 51, wherein said noise estimating means comprises: 

means for determining if it is a nose segment in advance; 

means fa comparing said input spectrum obtained by said Fourier transform means with a noise spectrum for 
compensation for each frequency when it is determined as noise; 

means for, when said input spectrum is smaller than said noise spectrum tor compensation, setting said noise 
spectrum for compensation of an associated frequency as an input spectrum to thereby estimate a noise spec- 
trum tor compensation; 

means for, when said input spectrum is smaller than said noise spectrum for compensation, setting said noise 
spectrum for compensation of an associated frequency as said input spectrum and adding said input spectrum 
at a given ratio to thereby estimate a mean noise spectrum; and 

means for storing said noise spectrum for compensation and said mean noise spectrum in said noise spectrum
storage means.

53. The speech coder according to claim 51, wherein said noise canceling/spectrum compensating means multiplies
said noise cancellation coefficient obtained by said noise cancellation coefficient adjusting means by said mean 
noise spectrum stored in said noise spectrum storage means, subtracts a result from said input spectrum obtained 
by said Fourier transform means, and compensates a frequency whose spectrum value has become negative with 
said noise spectrum for compensation stored in said noise spectrum storage means. 
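The subtraction and compensation recited in claim 53 can be illustrated with a minimal sketch. It assumes real-valued magnitude spectra held in plain lists; the function name `cancel_noise` and its parameter names are hypothetical and do not appear in the claims. The returned flags are an added convenience that records which frequencies were compensated.

```python
def cancel_noise(input_spec, mean_noise, comp_noise, coeff):
    """Subtract the scaled mean noise spectrum from one frame's input spectrum.

    input_spec, mean_noise, comp_noise: per-frequency magnitudes (assumed layout).
    coeff: noise cancellation coefficient.
    Returns (output_spectrum, compensated_flags).
    """
    output = []
    compensated = []
    for x, n_mean, n_comp in zip(input_spec, mean_noise, comp_noise):
        y = x - coeff * n_mean  # multiply coefficient by mean noise, then subtract
        if y < 0.0:
            # Spectrum value became negative: substitute the noise spectrum
            # for compensation at this frequency.
            output.append(n_comp)
            compensated.append(True)
        else:
            output.append(y)
            compensated.append(False)
    return output, compensated
```

For instance, with a coefficient of 2.0, an input value of 0.2 and a mean noise value of 0.5 give a negative difference, so that frequency takes the value of the noise spectrum for compensation instead.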

54. The speech coder according to claim 51, wherein said spectrum stabilizing means checks full range power of a
spectrum subjected to noise cancellation and spectrum compensation by said noise canceling/spectrum compen- 
sating means and power of a perceptually important partial band to discriminate if an input signal is an unvoiced 
segment, and performs stabilization and power reduction on said full range power and intermediate power when 
having determined that said input signal is an unvoiced segment. 

55. The speech coder according to claim 51, wherein said spectrum stabilizing means performs random-based phase
rotation on said complex spectrum obtained by said Fourier transform means based on information indicating 
whether or not said complex spectrum has been subjected to spectrum compensation by said noise cance- 
ling/spectrum compensating means. 
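One way to picture the random-based phase rotation of claim 55 is the sketch below. It assumes that only frequencies flagged as compensated are rotated and that the rotation angle is drawn uniformly from [-pi, pi]; both choices, as well as the name `rotate_compensated_phases`, are assumptions made for illustration and are not stated in the claim.

```python
import cmath
import math
import random


def rotate_compensated_phases(spectrum, compensated, rng):
    """Rotate the phase of each compensated complex bin by a random angle.

    spectrum: list of complex values, one per frequency.
    compensated: list of booleans marking bins that were spectrum-compensated.
    rng: a random.Random instance (passed in so results are reproducible).
    """
    rotated = []
    for value, was_compensated in zip(spectrum, compensated):
        if was_compensated:
            angle = rng.uniform(-math.pi, math.pi)  # assumed distribution
            # Multiplying by a unit-magnitude complex number changes only the phase.
            rotated.append(value * cmath.exp(1j * angle))
        else:
            rotated.append(value)
    return rotated
```

The magnitude of every bin is preserved, so the rotation alters only the phase information passed to the inverse Fourier transform.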

56. The speech coder according to claim 51, wherein said spectrum enhancing means has plural sets of weighting
coefficients for use in spectrum enhancement prepared in advance, selects a set of weighting coefficients in
accordance with a status of an input signal, and performs spectrum enhancement using said selected weighting
coefficients.

57. A speech decoder comprising: 

seed storage means for storing a plurality of seeds; 

an oscillator for outputting a vector stream in accordance with a value of a seed;

a synthesis filter for performing LPC synthesis on said vector stream output from said oscillator as an excitation 
vector to thereby produce a synthesized speech; and 
means for acquiring a seed from said seed storage means based on a seed number included in a received
speech code and supplying said seed to said oscillator.

58. The speech decoder according to claim 57, wherein said oscillator is a non-linear digital filter. 

59. The speech decoder according to claim 58, wherein said non-linear digital filter includes an adder having a
non-linear adder characteristic, a plurality of filter state holding sections to which an output of said adder is sequentially
transferred as a filter state, and a plurality of multipliers for multiplying a filter state, output from each of said filter 
state holding sections, by a gain and sending a multiplication value to said adder; 

seeds read from said seed storage means are supplied to said filter state holding sections as initial values of 
said filter states; 

said adder has an externally supplied vector stream and said multiplication values output from said multipliers
as input values and produces an adder output according to said non-linear adder characteristic with respect to 
a sum of said input values; and 

said gains of said multipliers are fixed in such a way that a pole of said digital filter lies outside a unit circle
on a Z plane.
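The oscillator of claims 57 to 59 can be sketched as a recursive filter whose states are initialised from a stored seed. The sketch below is illustrative only: the wrap-around function `wrap` stands in for the non-linear adder characteristic (the claim does not specify its shape), the externally supplied vector stream defaults to zeros, and the names `wrap`, `oscillate`, and the example gains are hypothetical.

```python
def wrap(x, limit=1.0):
    """Assumed non-linear adder characteristic: wrap the sum into [-limit, limit]."""
    return (x + limit) % (2.0 * limit) - limit


def oscillate(seed, gains, length, external=None):
    """Generate a vector stream from a seed.

    seed: initial values of the filter state holding sections.
    gains: one multiplier gain per state holding section.
    external: optional externally supplied vector stream (zeros if omitted).
    """
    states = list(seed)  # states[0] holds the most recent adder output
    stream = []
    for n in range(length):
        ext = external[n] if external is not None else 0.0
        # Each multiplier scales one filter state; the adder sums all inputs.
        total = ext + sum(g * s for g, s in zip(gains, states))
        y = wrap(total)  # adder output after the non-linear characteristic
        stream.append(y)
        states = [y] + states[:-1]  # shift the output through the state sections
    return stream
```

With gains `[0.5, -1.5]` the linear part has the characteristic polynomial z^2 - 0.5z + 1.5, whose complex poles have squared magnitude 1.5 and therefore lie outside the unit circle; the wrap keeps the output bounded. A decoder would look up `seed` in its seed table using the received seed number, and the same seed always yields the same stream.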

60. A speech decoder comprising: 
excitation vector storage means for storing old excitation vectors; 

excitation vector processing means for performing different processes on one or a plurality of old excitation 
vectors, read from said excitation vector storage means, in accordance with indices, thereby generating new 
random excitation vectors; 

a synthesis filter for performing LPC synthesis on said excitation vectors output from said excitation vector 
processing means to thereby produce a synthesized speech; and 

means for supplying an index included in a received speech code to said excitation vector processing means.

61. The speech decoder according to claim 60, wherein said excitation vector processing means includes:

means for determining process contents to be applied to old excitation vectors in accordance with said indices;
and

a plurality of processing sections for sequentially performing processes according to said determined process 
contents on old excitation vectors read from said excitation vector storage means. 
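The index-driven processing of claims 60 and 61 can be sketched as a chain of processing sections that an index switches on or off. The three processes shown (reversal, sign inversion, circular shift) and the bit-per-section index layout are assumptions chosen for illustration; the claims do not name specific processes, and all identifiers here are hypothetical.

```python
def reverse(vector):
    """Read the old excitation vector backwards."""
    return vector[::-1]


def negate(vector):
    """Invert the sign of every sample."""
    return [-x for x in vector]


def rotate(vector):
    """Circularly shift the vector by one sample."""
    return vector[1:] + vector[:1]


# Hypothetical processing sections, applied in this fixed order.
PROCESSES = [reverse, negate, rotate]


def process_excitation(old_vector, index):
    """Derive a new random excitation vector from an old one.

    Bit k of `index` decides whether PROCESSES[k] is applied (assumed mapping).
    """
    vector = list(old_vector)
    for k, process in enumerate(PROCESSES):
        if (index >> k) & 1:
            vector = process(vector)
    return vector
```

Under this mapping, index 0 returns the old vector unchanged and index 3 applies reversal followed by sign inversion, so a small number of stored vectors yields several distinct candidates.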

62. A CELP type speech decoder comprising:

an adaptive codebook for storing immediately previous excitation vector information as an adaptive code
vector;

a random codebook for generating a random code vector; and

a synthesis filter for performing LPC synthesis of said adaptive code vector and said random code vector,

said random codebook being constituted of an excitation vector generator comprising seed storage means for
storing a plurality of seeds, an oscillator for outputting different vector streams in accordance with values of 
seeds, and switch means for switching a seed to be supplied to said oscillator from said seed storage means 
based on a seed number included in a received speech code. 

63. A speech decoder comprising: 

an excitation vector generator having fixed waveform storage means for storing a plurality of fixed waveforms, 
fixed waveform arranging means for arranging said fixed waveforms read from said fixed waveform storage
means, at respective arbitrary start positions, and adding means for adding said fixed waveforms arranged by 
said fixed waveform arranging means to generate an excitation vector; 

a synthesis filter for synthesizing excitation vectors output from said adding means to produce a synthesized 
speech; and 

means for instructing a combination of start positions included in a received speech code to said fixed wave- 
form arranging means. 
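The arranging and adding steps of claim 63 can be sketched as follows. The sketch assumes that each fixed waveform is a short list of samples, that one start position is supplied per waveform, and that samples falling outside the frame are dropped; the frame length handling and the name `build_excitation` are assumptions rather than details taken from the claim.

```python
def build_excitation(waveforms, start_positions, length):
    """Place each fixed waveform at its start position and sum the results.

    waveforms: list of fixed waveforms (lists of samples).
    start_positions: one start position per waveform, as decoded from the speech code.
    length: number of samples in the excitation vector (assumed frame length).
    """
    excitation = [0.0] * length
    for waveform, start in zip(waveforms, start_positions):
        for offset, sample in enumerate(waveform):
            position = start + offset
            if 0 <= position < length:  # assumed: samples past the frame are dropped
                excitation[position] += sample  # adding means: overlap sums
    return excitation
```

Because only the start positions are transmitted, the decoder needs to store just the few fixed waveforms rather than every candidate excitation vector.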

64. A CELP type speech decoder comprising: 

an adaptive codebook for storing immediately previous excitation vector information as an adaptive code
vector;

a random codebook for generating a random code vector; and 

a synthesis filter for performing LPC synthesis of said adaptive code vector and said random code vector, 
said random codebook being constituted of an excitation vector generator comprising fixed waveform storage
means for storing a plurality of fixed waveforms, fixed waveform arranging means for arranging said fixed
waveforms read from said fixed waveform storage means, at respective arbitrary start positions, adding means for
adding said fixed waveforms arranged by said fixed waveform arranging means to generate an excitation
vector, and means for instructing a combination of start positions included in a received speech code to said fixed
waveform arranging means.

65. The CELP type speech decoder according to claim 64, further comprising: 

a second random codebook for generating random code vectors; and 

selecting means for selecting one of said random codebook and said second random codebook. 

66. The CELP type speech decoder according to claim 65, wherein said second random codebook is vector storage 
means having a plurality of random number streams stored. 

67. The CELP type speech decoder according to claim 65, wherein said second random codebook is pulse stream 
storage means having a plurality of pulse streams stored.

68. The CELP type speech decoder according to claim 65, wherein said second random codebook has a same
structure as said excitation vector generator, and a number of fixed waveforms to be stored in said fixed waveform
storage means differs from said random codebook.

FIG. 2A
FIG. 9
FIG. 15